Authors' Response to Peer-Review Reports: https://med.jmirx.org/2021/2/e28917/
Published Article: https://med.jmirx.org/2021/2/e25560/
This is a peer-review report submitted for the paper “Machine Learning for Risk Group Identification and User Data Collection in a Herpes Simplex Virus Patient Registry: Algorithm Development and Validation Study.”
Round 1 Review
The authors discuss a platform containing a random forest classifier applied to the medical reports of patients with herpes simplex virus. The manuscript presents an introduction to the topic, the problem the authors intend to solve, the proposed solution, and a discussion. Although the research seems interesting, the manuscript has some weaknesses that the authors must resolve.
- Authors should read the authors’ guidelines at https://www.jmir.org/content/author-instructions. I suggest that they adapt their manuscript to the templates offered by JMIR; the title does not match the format proposed by the journal, the appendices do not have a caption, the tables can go in the manuscript, etc.
- In relation to the content of the manuscript, there is no exhaustive bibliographic review in which existing studies applied to a classification problem such as the one the authors present are mentioned. Because of this, the justification for the development they propose is quite weak and can be improved upon.
- Authors indicate that they separated the data sets by train_test_split; however, there is no clear description of the content of these two data sets. It is not known whether the classes are balanced or not, and no data preprocessing was done to ensure that the generated model is optimal for any type of data. Authors should indicate if they have done a cross-validation when training their model or not. If not, I recommend that they do it.
- It would be enlightening to show the confusion matrix and to include a table comparing the precision and accuracy of the random forest under different hyperparameters.
- To search for the best hyperparameters, I suggest using GridSearchCV or similar.
- Finally, it is necessary to make a comparison between the proposed model and others that already exist.
- Authors are requested to upload their code and the models to a repository to guarantee their reproducibility.
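As a rough illustration of the reporting requested above (class balance, confusion matrix, precision, and accuracy), the following standard-library Python sketch computes these quantities for a binary classifier. The labels are hypothetical placeholders, not data from the manuscript, and the helper names are invented for this example; in practice, scikit-learn's `confusion_matrix`, `precision_score`, and `accuracy_score` would likely be used instead.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of samples that fall in each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def binary_confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Hypothetical held-out labels and predictions (assumed for illustration only).
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

balance = class_balance(y_true)
tp, fp, fn, tn = binary_confusion_counts(y_true, y_pred)

print("class balance:", balance)
print("confusion counts (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("precision:", round(precision(tp, fp), 3))
print("accuracy:", round(accuracy(tp, fp, fn, tn), 3))
```

Reporting the class balance alongside accuracy matters because, with imbalanced classes, a model that always predicts the majority class can still reach a high accuracy. Repeating this calculation for each hyperparameter setting would yield the comparison table suggested above.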
Round 2 Review
I thank the authors for their work in improving this manuscript. They have responded correctly to all my suggestions, and I consider that the manuscript has improved in quality and can be considered for publication in this journal.
Conflicts of Interest
Edited by G Eysenbach; This is a non–peer-reviewed article. submitted 18.03.21; accepted 18.03.21; published 11.06.21
Copyright
©José Alberto Benítez Andrades. Originally published in JMIRx Med (https://med.jmirx.org), 11.06.2021.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIRx Med, is properly cited. The complete bibliographic information, a link to the original publication on https://med.jmirx.org/, as well as this copyright and license information must be included.