Assessing ChatGPT’s capacity for clinical decision support in pediatrics: A comparative study with pediatricians using KIDMAP of Rasch analysis

Hsu-Ju KAO, Tsair-Wei CHIEN, Wen Chung WANG, Willy CHOU, Julie Chi CHOW

Research output: Contribution to journalArticlespeer-review

11 Citations (Scopus)

Abstract

Background: The application of large language models in clinical decision support (CDS) is an area that warrants further investigation. ChatGPT, a prominent large language models developed by OpenAI, has shown promising performance across various domains. However, there is limited research evaluating its use specifically in pediatric clinical decision-making. This study aimed to assess ChatGPT’s potential as a CDS tool in pediatrics by evCDSaluating its performance on 8 common clinical symptom prompts. Study objectives were to answer the 2 research questions: the ChatGPT’s overall grade in a range from A (high) to E (low) compared to a normal sample and the difference in assessment of ChatGPT between 2 pediatricians.
Methods: We compared ChatGPT’s responses to 8 items related to clinical symptoms commonly encountered by pediatricians. Two pediatricians independently assessed the answers provided by ChatGPT in an open-ended format. The scoring system ranged from 0 to 100, which was then transformed into 5 ordinal categories. We simulated 300 virtual students with a normal distribution to provide scores on items based on Rasch rating scale model and their difficulties in a range between −2 to 2.5 logits. Two visual presentations (Wright map and KIDMAP) were generated to answer the 2 research questions outlined in the objectives of the study.
Results: The 2 pediatricians’ assessments indicated that ChatGPT’s overall performance corresponded to a grade of C in a range from A to E, with average scores of −0.89 logits and 0.90 logits (=log odds), respectively. The assessments revealed a significant difference in performance between the 2 pediatricians (P < .05), with scores of −0.89 (SE = 0.37) and 0.90 (SE = 0.41) in log odds units (logits in Rasch analysis).
Conclusion: This study demonstrates the feasibility of utilizing ChatGPT as a CDS tool for patients presenting with common pediatric symptoms. The findings suggest that ChatGPT has the potential to enhance clinical workflow and aid in responsible clinical decision-making. Further exploration and refinement of ChatGPT’s capabilities in pediatric care can potentially contribute to improved healthcare outcomes and patient management. Copyright © 2023 the Author(s).
Original languageEnglish
Article numbere34068
JournalMedicine
Volume102
Issue number25
DOIs
Publication statusPublished - Jun 2023

Citation

Kao, H.-J., Chien, T.-W., Wang, W.-C., Chou, W., & Chow, J. C. (2023). Assessing ChatGPT’s capacity for clinical decision support in pediatrics: A comparative study with pediatricians using KIDMAP of Rasch analysis. Medicine, 102(25), Article e34068. http://dx.doi.org/10.1097/MD.0000000000034068

Keywords

  • Artificial intelligence
  • ChatGPT
  • KIDMAP
  • Logit
  • Pediatrics
  • Rasch analysis
  • Wright Map

Fingerprint

Dive into the research topics of 'Assessing ChatGPT’s capacity for clinical decision support in pediatrics: A comparative study with pediatricians using KIDMAP of Rasch analysis'. Together they form a unique fingerprint.