<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.0 20040830//EN" "journalpublishing.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="2.0" xml:lang="en" article-type="reviewer-report"><front><journal-meta><journal-id journal-id-type="nlm-ta">JMIRx Med</journal-id><journal-id journal-id-type="publisher-id">xmed</journal-id><journal-id journal-id-type="index">34</journal-id><journal-title>JMIRx Med</journal-title><abbrev-journal-title>JMIRx Med</abbrev-journal-title><issn pub-type="epub">2563-6316</issn><publisher><publisher-name>JMIR Publications</publisher-name><publisher-loc>Toronto, Canada</publisher-loc></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">v7i1e96227</article-id><article-id pub-id-type="doi">10.2196/96227</article-id><article-categories><subj-group subj-group-type="heading"><subject>Peer-Review Report</subject></subj-group></article-categories><title-group><article-title>Peer Review of &#x201C;The Performance of DeepSeek R1 and Gemini 3 in Complex Medical Scenarios: Comparative Study&#x201D;</article-title></title-group><contrib-group><contrib contrib-type="author"><name name-style="western"><surname>Kejriwal</surname><given-names>Mayank</given-names></name><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff id="aff1"><institution>University of Southern California</institution><addr-line>Los Angeles</addr-line><addr-line>CA</addr-line><country>United States</country></aff><contrib-group><contrib contrib-type="editor"><name name-style="western"><surname>Schwartz</surname><given-names>Amy</given-names></name></contrib></contrib-group><pub-date pub-type="collection"><year>2026</year></pub-date><pub-date pub-type="epub"><day>27</day><month>4</month><year>2026</year></pub-date><volume>7</volume><elocation-id>e96227</elocation-id><history><date 
date-type="received"><day>26</day><month>03</month><year>2026</year></date><date date-type="rev-recd"><day>26</day><month>03</month><year>2026</year></date><date date-type="accepted"><day>26</day><month>03</month><year>2026</year></date></history><copyright-statement>&#x00A9; Mayank Kejriwal. Originally published in JMIRx Med (<ext-link ext-link-type="uri" xlink:href="https://med.jmirx.org">https://med.jmirx.org</ext-link>), 27.4.2026. </copyright-statement><copyright-year>2026</copyright-year><license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIRx Med, is properly cited. The complete bibliographic information, a link to the original publication on <ext-link ext-link-type="uri" xlink:href="https://med.jmirx.org/">https://med.jmirx.org/</ext-link>, as well as this copyright and license information must be included.</p></license><self-uri xlink:type="simple" xlink:href="https://xmed.jmir.org/2026/1/e96227"/><related-article related-article-type="companion" ext-link-type="doi" xlink:href="10.1101/2025.04.29.25326666" xlink:title="Preprint (medRxiv)" xlink:type="simple">https://www.medrxiv.org/content/10.1101/2025.04.29.25326666v1</related-article><related-article related-article-type="companion" ext-link-type="doi" xlink:href="10.2196/96220" xlink:title="Authors' Response to Peer-Review Reports" xlink:type="simple">https://med.jmirx.org/2026/1/e96220</related-article><related-article related-article-type="companion" ext-link-type="doi" xlink:href="10.2196/76822" xlink:title="Published Article" 
xlink:type="simple">https://med.jmirx.org/2026/1/e76822</related-article><kwd-group><kwd>large reasoning model</kwd><kwd>LRM</kwd><kwd>large language model</kwd><kwd>LLM</kwd><kwd>accuracy</kwd><kwd>medical scenario</kwd><kwd>DeepSeek R1</kwd><kwd>Gemini 3</kwd></kwd-group></article-meta></front><body><p><italic>This is a peer-review report for &#x201C;The Performance of DeepSeek R1 and Gemini 3 in Complex Medical Scenarios: Comparative Study.&#x201D;</italic></p><sec id="s2"><title>Round 1 Review</title><sec id="s1-1"><title>General Comments</title><p>This paper [<xref ref-type="bibr" rid="ref1">1</xref>] reports on an experimental study to analyze the Massive Multitask Language Understanding Pro (MMLU-Pro) Q&#x0026;A dataset. The authors find that DeepSeek R1 had an accuracy rate of 95.1% in 162 medical scenarios after reconciliation with subject matter experts on 23 questions. The findings contribute to the growing body of knowledge on large language model applications in health care and provide insights into the strengths and limitations of DeepSeek R1 in this domain.</p></sec><sec id="s1-2"><title>Specific Comments</title><sec id="s1-2-1"><title>Major Comments</title><p>1. The results are not appropriately qualified with tests of statistical significance, and/or they lack comparisons with other language models. Even if we know how other models perform overall, it would still be good to have more details, such as a comparison of where one model is right and another is wrong. Deeper insights of this kind are lacking in this paper. All we really know is that DeepSeek performs at a level roughly equivalent to the other leading models (nothing surprising there) and that it sometimes exhibits incomplete or inexplicable behavior. I feel the paper needs to have more results and analysis to be a good fit for this journal.</p><p>2. Maybe you could add a workflow diagram/figure to better illustrate the methods?</p><p>3. I would like Table 1 to be augmented. 
Perhaps you can add an example question with answer choices? Right now, it looks very trivial. The alternative is to create a simple bar graph instead of a table, but the former would be more useful.</p></sec></sec></sec></body><back><fn-group><fn fn-type="conflict"><p>None declared.</p></fn></fn-group><glossary><title>Abbreviations</title><def-list><def-item><term id="abb1">MMLU-Pro</term><def><p>Massive Multitask Language Understanding Pro</p></def></def-item></def-list></glossary><ref-list><title>References</title><ref id="ref1"><label>1</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Bajwa</surname><given-names>M</given-names> </name><name name-style="western"><surname>Hoyt</surname><given-names>R</given-names> </name><name name-style="western"><surname>Knight</surname><given-names>D</given-names> </name><name name-style="western"><surname>Haider</surname><given-names>M</given-names> </name></person-group><article-title>The performance of DeepSeek R1 and Gemini 3 in complex medical scenarios: comparative study</article-title><source>JMIRx Med</source><year>2026</year><volume>7</volume><fpage>e76822</fpage><pub-id pub-id-type="doi">10.2196/76822</pub-id></nlm-citation></ref></ref-list></back></article>