Authors' Response to Peer Reviews of “COVID-19 Outcomes and Genomic Characterization of SARS-CoV-2 Isolated From Veterans in New England States: Retrospective Analysis”


from the United Kingdom, B.1.1.28 from Brazil, and B.1.351 from South Africa [13] warrant constant new data and knowledge translation. To this effect, this paper addresses a major area of concern and interest to the readership of the journal. The authors are clear in their title, which still needs to fully comply with the journal guidelines. The Abstract follows the guidelines and presents an overview of the study. Being an area that has received tremendous interest since the start of the COVID-19 pandemic, there was an overriding need for this study to be put in context. The paper's introduction does well, ends with the study aim, and is brief at highlighting the main concern but deserves more attention. The general structure of the paper needs improvement to comply with the journal guidelines. The data collection methods, albeit needing clarification, seem reasonable with appropriate analysis, thereby giving value to the results. The discussion of the paper has been well articulated, and the conclusion ties with the research objective. The English used is simple and in plain language for easy comprehension.
Although congratulating the authors for a good attempt and concise paper, the paper will benefit from more value if the following specific comments are given consideration.
Response: We thank the reviewer for the careful review of this paper and the summary. Their feedback has certainly improved this manuscript. Our response to each comment is in the following sections.
1) Summarize the evidence already reported on the topic
2) Report why this study was necessary and the value added to the existing literature
3) The implication of all available evidence (including that from this study)
Response: We thank the reviewer for this comment. It has certainly made our Introduction section stronger. We have added references and language on the implication and relevance of evidence in this area of work in the field of COVID-19. Please see our full introduction for the additions and modifications, pages 2-3, lines 60-98.
5. It will be good to structure your Introduction into Background, Study Rationale, and Study Aim.
Response: We have added these subheadings to our Introduction section.
6. Kindly structure your Methods section and report it as follows:

11) Ethical considerations
Response: We have structured the Methods section with all these subheadings. Specific comments about methods will be addressed in the following points.
7. It is not clear whether this was a retrospective study since patients were still hospitalized at the time of this study. In 6.2 above, kindly be precise about the type of cohort study you undertook.
Response: This was a retrospective study, which we have now stated clearly in our Methods section, page 6, line 106: "We conducted a retrospective chart review to gather the demographic and clinical variables." For further clarification of our methods, we included the following in our Methods section, page 7, lines 131-134: "All data collection was retrospective after a diagnosis of COVID-19 had been confirmed. If chart review occurred while a veteran was hospitalized, the chart was again reviewed retrospectively after discharge from hospital."
8. As part of your participant recruitment, indicate attempts made to reduce bias.
Response: This was a retrospective study, so we did not recruit participants. As indicated in our Methods section, we included all veterans who were diagnosed with COVID-19 in this era with accessible medical records.
9. In 6.6 above, give details of those that collected samples and how that was done. If this was done by your research team, ensure to report the protocol used to collect samples. Organize your data collection into:
10. In 6.7 above, kindly clarify how samples were handled (including storage). If this was not done by the research team and was only reported, kindly indicate as such. If samples were not collected by you, provide details on how you had access to samples.
Response: This has been clarified now in the subsection Sample Collection and Handling, page 7, lines 136-144: "Sample collection and handling: Handling of nasopharyngeal specimens or isolated virus was carried out by the VACHS clinical laboratory as part of clinical care, following standardized CLIA guidelines. Our viral repository was populated by the positive test results of all New England veterans. VACHS laboratory handled specimens, isolated the SARS-CoV-2 RNA, and shipped it for whole genome sequencing (WGS) to non-VA laboratory. We obtained the details of platform used to diagnose, the cycle threshold, and the date of test from the laboratory. Sequencing of viral genome was conducted at the non-VA laboratory by our co-authors as follows."
11. In 6.9 above, it is important to report the protocol/guidelines you used in genome sequencing. You may want to justify your procedure using these WHO guidelines [16] as well as substantiating your procedure with a visual display/flow of how the sequencing works.
Response: We thank the reviewer for this comment and would like to provide clarification. The genome sequencing method and the alignment approach are defined clearly in the subsection on WGS. Assignment of lineages was with Pangolin as described. Citations have been provided for reference. Any further granular detail on this method would be out of the scope of this paper.
12. As part of your statistical analysis, could you please justify your use of nonparametric tests? Kindly report the normality tests that were performed and the figures.
Response: We used logistic regressions to model the outcomes of hospitalization and mortality, and ordinal logistic regression to model peak disease severity because the outcomes were categorical and ordinal, respectively. Logistic and ordinal logistic regressions do not require an assumption of normality. We have edited our paper to make this clearer, page 8, lines 168 and 169: "We used STATA v16 (College Station, TX) for logistic regressions to predict our hospitalization and mortality, and ordinal logistic regression to predict peak disease severity."
13. It might be worth arranging your data analysis first into univariate analysis and multivariate analysis, and then into hospitalization, peak disease severity, mortality, and genome sequencing.
Response: We have rephrased our Methods section to make the structure of analysis clearer, pages 8 and 9, lines 168-172: "We used STATA v16 (College Station, TX) for logistic regressions to predict our hospitalization and mortality, and ordinal logistic regression to predict peak disease severity. We first conducted a univariate analysis, then used significant variables from the univariate analysis (P < 0.05) to use in a multivariate model for each of our outcomes to assess the impact of several variables at once, which has been frequently used in COVID-19 literature. Genomic characteristics were reported descriptively."
14. In your data analysis, kindly report how you moved from univariate to multivariate analysis or how you selected variables for your multivariate model.
Response: We agree with the reviewer that more clarification is necessary, so we have described our methods in more detail, pages 8 and 9, lines 169-172: "We first conducted a univariate analysis, then used significant variables from the univariate analysis (P < 0.05) to use in a multivariate model for each of our outcomes to assess the impact of several variables at once, which has been frequently used in COVID-19 literature."
15. It is very important to indicate the guidelines used to report your review results. As part of your ethical considerations, indicate the guidelines you used to report your results. You may want to use these depending on which best suits your study method [17,18].
Response: We thank the reviewer. We have cited the RECORD statement for this. Our report follows those guidelines, page 9, lines 181 and 182: "RECORD statement guidelines were used to maintain transparency in the reporting of this work."
19. Note that the whole of your manuscript must be in portrait. You may want to highlight your Table 1 then click on "fit to window" on the automatic adjustment tab of Microsoft Word and move it together with Figure 1 to the Genomic Sequencing subsection of your Results section.
Response: We thank the reviewer for this comment, and we have adjusted Table 1 so that it fits within a portrait page.
20. In the presentation of the results of your logistic regression, it will be good to state how the following assumptions were met:
1) Binary outcome
2) Linearity

4) Multicollinearity
Response: We thank the reviewer for the comment and have included the following sentence in the Methods section, page 9, lines 173 and 174: "Assumptions for logistic regressions (binary outcome, linearity, no outliers, and multicollinearity) were tested and met, with maximum variance inflation factors of 2."
21. As part of the reported results of your regression, I suggest providing an explanation on your model's goodness of fit by plotting and reporting the area under the receiver operating characteristic (ROC) curve.
Response: We agree with the reviewer, and we have provided the area under the ROC curve (the C-statistic) for our multivariate models in the text of the Results section, page 11, lines 207-213: "In multivariate regression, significant predictors of hospitalization (C-statistic: 0.75) were age (OR: 1.05, 95% CI: 1.03, 1.08) and non-White race (OR: 2.39, 95% CI: 1.13, 5.01) (
23. Include a subsection "Author Contribution" after the Acknowledgments section to state the contribution of each author included in this paper.
Response: We have included author contributions on page 19, lines 348-354: "Author contributions: The authors confirm contribution to the manuscript as follows: ML and SG participated in the conception, design, data collection, analysis and interpretation of results, and manuscript preparation. YHS and MR participated in the data collection, analysis and interpretation of results, and manuscript preparation. MEP and NDG participated in the conduction, analysis and interpretation of whole genome sequencing, and in manuscript preparation. DC participated in the data collection, analysis and interpretation of results. CBFV, JRF and TA participated in the conduction and analysis of whole genome sequencing."
24. Include a subsection "Conflicts of Interest" after "Author Contributions" to declare any conflict of interest.
32. In your Discussion section, it will be appropriate to organize the "Comparison With Prior Studies" into subtitles as follows:

1) Predictors of hospitalization
2) Predictors of peak disease severity
3) Predictors of mortality
4) Genomic sequencing
Response: We thank the reviewer for this comment. We considered this but found that dividing the first part of the discussion into these four subheadings would result in small subsections. We instead took the reviewer's prior suggestion of dividing the discussion into three subsections: Principal Findings, Comparison With Prior Studies, and Limitations. Our Discussion section has been strengthened by this.
33. I suggest starting your conclusion with a statement on the study objectives followed by a summary of findings, then lessons learned from your findings, and finally suggested direction of future research.
Response: We thank the reviewer for the suggestion and have reframed the first paragraph of our introduction to fit with the reviewer's suggestions, pages 15 and 16, lines 288-300: "Our study found that in a cohort of veterans with average age of 63 years and a high comorbidity burden, age significantly associated with risk of hospitalization, peak disease severity, and mortality. O2 requirement upon admission correlated with peak disease severity and mortality, while dementia was an additional factor associated with higher mortality. The CDC provides a list of chronic medical conditions (May 2021) that predispose individuals to severe illness from SARS-CoV-2 infection [21], but >75% of United States adults fall under a high-risk category [22]. Veterans are a unique cohort because of advanced age on average [23], and more comorbidities. Understanding clinical factors that impact outcomes in veterans will help healthcare providers risk-stratify patients with similar demographic profiles, and future research should explore the impact of new treatments and vaccination on outcomes."
For articles without PMIDs, kindly include a DOI and ensure you verify your DOIs using https://www.doi.org/ to make sure they work.
Response: We have edited our references to include PMIDs whenever available and formatted them according to journal guidelines.
37. For referenced websites, ensure to make as much effort as possible to get and reference the PDF version of the article (ie, in the absence of a PMID and DOI).
Response: We have made every effort to reference PDF versions of articles whenever possible.

General Comments
The authors presented a study about the clinical and genomic characterization of COVID-19 from a veteran group. I have some questions for the authors.
1. Line 85: Authors wrote, "we recorded hospitalization status, mortality, and oxygen (O2)-requirement within 24 hours of admission." Here, can authors clarify if they recorded each single patient's clinical information within 24 hours of admission or they collected them from chart review? In addition, for O2, the 2 should be subscript.
Response: We thank the reviewer for helping us clarify this. We did gather this information from manual chart review and have updated our methods to read, page 8, lines 160 and 161: "Our categorical outcomes, also derived from manual chart review, were hospitalization status, mortality, and oxygen (O2)-requirement within 24 hours of admission from manual chart review." We have also changed O2 throughout the manuscript to have a subscript.
2. Lines 105 and 106: The disease name should be capitalized.
Response: We thank the reviewer for this comment; however, disease names are not typically capitalized unless they are an abbreviation.
3. Line 113: Authors did not provide a transition between the univariate regression and multivariate regression. Univariate analysis was simply mentioned in the first sentence without any explanation or discussion. Authors should indicate the reason why they conducted multivariate analysis (eg, univariate was not specific enough). Additionally, in general, the factors should have the first letter capitalized, for example, Age, non-White race.
Response: We thank the reviewer for this comment. As in our response to reviewer W, we have edited our description and clarified our univariate and multivariate analyses, pages 8 and 9, lines 168-172: "We used STATA v16 (College Station, TX) for logistic regressions to predict our hospitalization and mortality, and ordinal logistic regression to predict peak disease severity. We first conducted a univariate analysis, then used significant variables from the univariate analysis (P < 0.05) to use in a multivariate model for each of our outcomes to assess the impact of several variables at once, which has been frequently used in COVID-19 literature." We have ensured that White and non-White are capitalized where present. Age is usually not capitalized.
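For readers who want to see the mechanics of the screening step quoted above, the following Python sketch illustrates one way a "univariate first, then multivariate" workflow can be expressed. It is not the authors' STATA code: the function names (`fit_logistic`, `screen_then_fit`, `toy_cohort`), the predictor names, and the cell counts are all invented for illustration, and Wald tests are assumed as the significance test. Only the P < 0.05 threshold comes from the response.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def invert(a):
    # Gauss-Jordan inverse with partial pivoting; adequate for small matrices.
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit_logistic(columns, y, iterations=25):
    # Newton-Raphson maximum likelihood fit with a fixed number of iterations.
    # Returns (coefficients, standard errors), intercept first.
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        probs = [sigmoid(sum(b * x for b, x in zip(beta, row))) for row in rows]
        grad = [sum((yi - pi) * row[j] for yi, pi, row in zip(y, probs, rows))
                for j in range(k)]
        info = [[sum(pi * (1.0 - pi) * row[a] * row[b] for pi, row in zip(probs, rows))
                 for b in range(k)] for a in range(k)]
        cov = invert(info)  # inverse information matrix from this iteration
        beta = [beta[j] + sum(cov[j][c] * grad[c] for c in range(k)) for j in range(k)]
    return beta, [math.sqrt(cov[j][j]) for j in range(k)]

def wald_p(coef, se):
    # Two-sided P value from the normal approximation to the Wald statistic.
    return math.erfc(abs(coef / se) / math.sqrt(2.0))

def screen_then_fit(predictors, y, alpha=0.05):
    # Step 1: one univariate model per predictor; keep those with P < alpha.
    selected = []
    for name, col in predictors.items():
        beta, se = fit_logistic([col], y)
        if wald_p(beta[1], se[1]) < alpha:
            selected.append(name)
    # Step 2: a single multivariate model containing only the retained predictors.
    beta, se = fit_logistic([predictors[n] for n in selected], y)
    adjusted = {n: (math.exp(beta[i + 1]), wald_p(beta[i + 1], se[i + 1]))
                for i, n in enumerate(selected)}
    return selected, adjusted

def toy_cohort():
    # Synthetic cells: (oxygen, dementia, deaths, survivors). Invented counts, not study data.
    cells = [(1, 1, 18, 2), (1, 0, 12, 8), (0, 1, 8, 12), (0, 0, 2, 18)]
    oxygen, dementia, unrelated, died = [], [], [], []
    for o, d, deaths, survivors in cells:
        for outcome, count in ((1, deaths), (0, survivors)):
            for i in range(count):
                oxygen.append(float(o))
                dementia.append(float(d))
                unrelated.append(float(i % 2))  # alternates within each group, so no signal
                died.append(outcome)
    columns = {"oxygen_on_admission": oxygen, "dementia": dementia,
               "unrelated_flag": unrelated}
    return columns, died

predictors, died = toy_cohort()
selected, adjusted = screen_then_fit(predictors, died)
print("Retained after univariate screen:", selected)
for name, (odds_ratio, p) in adjusted.items():
    print(name, "adjusted OR:", round(odds_ratio, 2), "P:", p)
```

In this toy data the uninformative flag is dropped at the screening step and the two remaining predictors are then adjusted for each other. A production analysis would normally rely on an established statistics package rather than a hand-written solver.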
4. Line 129: Authors wrote, "our study found that in an older cohort of veterans." Here, older cohort could cause some confusion to some readers. When one reads the paper a few years later, he or she probably cannot understand what the older cohort is related to. Authors can add a time frame to it.
Response: This is a thoughtful comment. We thank the reviewer and have added the cohort's average age to clarify this point, page 15, lines 288-290: "Our study found that in a cohort of veterans with an average age of 63 years and a high comorbidity burden, age significantly associated with risk of hospitalization, peak disease severity, and mortality."
7. Line 137: Authors wrote, "in our study, age was a significant predictor for all of our outcomes and was a confounder for other variables." Most scientific papers are written from the third-person point of view. Therefore, it is not common to state the study outcomes as "our outcome." Authors should use a better phrase, such as in line 151: "This may explain the outcomes in our study."
Response: We agree with the reviewer and have rephrased this sentence to be, page 16, lines 304 and 305: "In our study, age was a significant predictor for all of the studied outcomes and was a confounder for other variables."
8. Line 138: Authors wrote, "interestingly, LTC status predicted all three of our outcomes on univariate analysis, but not on multivariate analyses. Earlier in the COVID-19 pandemic, residents of nursing homes had higher rates of infection as well as severe illness and mortality [25]." There is no transition between these two sentences. The first few sentences in the paragraph discussed age as a predictor. However, the sentence "earlier in the COVID-19 pandemic..." did not show an immediate connection with the age issue. Maybe the authors would like to express that nursing homes have older patients. If this is the case, the authors need to provide some connection or background information here.
Response: We do agree that we were trying to say nursing homes may have older patients. We have connected the two ideas, page 16, lines 304-307: "In our study, age was a significant predictor for all of the studied outcomes and was a confounder for other variables. Accordingly, LTC status predicted all three of our outcomes on univariate analysis, but not on multivariate analyses, possibly because LTC units tend to have older residents."
9. Line 140: Authors wrote that "our study shows that among veterans in LTC facility, disease outcomes were not impacted by their residence status." Here, authors should provide some discussion or reasons for their findings.
Response: We thank the reviewer for pointing this out. We intended to carry on the previous thought that after adjusting for age, residents of a long-term care (LTC) facility did not have worse outcomes. We have reworded this sentence, page 16, lines 308 and 309: "Our study shows disease outcomes were not impacted by their residence status, after adjusting for age."
10. Line 148: Authors wrote, "our study supports data from previous reports that non-White patients are at increased risk of hospitalization but have similar peak severity and mortality outcomes [26][27][28][29]." Are these non-White patients in the United States or in other countries? This could change the dynamic and purpose of citing the reference. Please clarify.
Response: These studies are from the United States, and we have clarified this point on page 17, lines 315-317: "Our study supports data from previous reports that non-White patients in the United States are at increased risk of hospitalization but have similar peak severity and mortality outcomes."
11. Line 156: Authors concluded that, for patients with dementia, they could have a high risk of death because of biological factors. Another possibility is the lack of self-report ability in patients with dementia. As a result, they probably do not understand their body's changes, which could delay the needed care.
Response: We thank the reviewer for this comment and have added in this explanation, page 17, lines 318-321: "This may be explained by a host of biological factors but also may be a result of inability to self-report symptoms. This finding emphasizes the importance of extra care and monitoring required when approaching a patient with dementia."
12. For the Discussion section, authors may add subtitles to different issues they would like to discuss. The current writing may be a little bit confusing to some readers.
Response: We thank the reviewer for this comment and have added subsections entitled, "Principal Findings," "Comparison With Previous Studies," and "Limitations" to our Discussion section.
13. In the Discussion, the authors mentioned multivariate analysis of many potential risk factors as their strength. It is true that the multivariate model is a powerful tool, but it is not necessarily fit for the COVID-19 situation very well. Authors need to cite references about other cases of using the multivariate model for COVID-19 outcome analysis.
Response: We thank the reviewer for this comment and have added several references to other studies using multivariate models after the following sentence in the methods, pages 8 and 9, lines 169-172: "We first conducted a univariate analysis, then used significant variables from the univariate analysis (P < 0.05) to use in a multivariate model for each of our outcomes to assess the impact of several variables at once, which has been frequently used in COVID-19 literature."
15. Figure 1: Authors should provide a better maximum likelihood tree. The current figure has many branches stacked to each other, barely providing any helpful information to readers.
Response: We thank the reviewer for this comment, and we are showing only the branches in which we have a sequence. From this figure, we are hoping to show the diversity of lineages, with the main branch points labeled. For more in-depth information