Chat Generative Pretrained Transformer-4.0's accuracy in assessing cervical vertebrae and hand-wrist maturation stages: A retrospective study
Abstract
Introduction: This study evaluated the diagnostic accuracy of Chat Generative Pretrained Transformer version 4.0 (ChatGPT-4.0) in determining cervical vertebrae and hand-wrist maturation stages from cephalometric and hand-wrist radiographs. Methods: A retrospective analysis was conducted on 238 subjects whose cephalometric and hand-wrist radiographs had been taken on the same day. Three orthodontists independently assessed the hand-wrist maturation stages using the method described by Björk and Helm, and the cervical vertebrae maturation stages using the method proposed by Baccetti and coworkers. These evaluations served as the reference standard for measuring the performance of ChatGPT-4.0. ChatGPT-4.0 analyzed the same hand-wrist and cephalometric radiographs, and the primary researcher recorded its results. Results: Among the hand-wrist maturation stages, the model performed best at the RU stage (area under the curve [AUC], 0.89). At the PP3U and MP3U stages, precision was high but recall was low, indicating that the model missed some true instances of these stages. At other stages, particularly DP3U and MP3CAP, both precision and recall were low, which limited classification accuracy. Among the cervical vertebral maturation stages (CVS), the model performed best at CVS1 (AUC, 0.82; precision, 0.806), followed by CVS2 (AUC, 0.77). Its predictive performance at CVS3 and CVS6 was suboptimal (AUC <0.67). Conclusions: ChatGPT-4.0 predicted the RU and CVS1 stages accurately. However, its overall performance was inferior to that of other artificial intelligence models.
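The per-stage precision, recall, and AUC values reported above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: each stage is scored one-vs-rest against the orthodontists' reference label, and the model returns a single stage per radiograph rather than a probability, in which case the one-vs-rest AUC reduces to the mean of sensitivity and specificity. The function name `per_stage_metrics` and the toy labels are hypothetical and are not data from the study.

```python
def per_stage_metrics(reference, predicted):
    """One-vs-rest precision, recall, and AUC for each stage label.

    Assumes hard (single-stage) predictions, so AUC for a stage is
    (sensitivity + specificity) / 2. This is an illustrative choice,
    not necessarily the procedure used in the study.
    """
    pairs = list(zip(reference, predicted))
    n = len(pairs)
    results = {}
    for stage in sorted(set(reference) | set(predicted)):
        tp = sum(r == stage and p == stage for r, p in pairs)
        fp = sum(r != stage and p == stage for r, p in pairs)
        fn = sum(r == stage and p != stage for r, p in pairs)
        tn = n - tp - fp - fn
        # Guard against empty denominators for stages never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        results[stage] = {
            "precision": precision,
            "recall": recall,
            "auc": (recall + specificity) / 2,
        }
    return results


# Hypothetical toy labels, for illustration only (not study data).
reference = ["CVS1", "CVS1", "CVS2", "CVS2", "CVS3", "CVS3"]
predicted = ["CVS1", "CVS1", "CVS1", "CVS2", "CVS2", "CVS3"]

metrics = per_stage_metrics(reference, predicted)
for stage, values in metrics.items():
    print(stage, {name: round(value, 3) for name, value in values.items()})
```

In this toy example, CVS3 has a precision of 1.0 but a recall of 0.5: every CVS3 prediction is correct, yet half of the true CVS3 cases are assigned to another stage. This is the same pattern described for the PP3U and MP3U stages.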
















