
dc.contributor.author: Ekici, Omer
dc.date.accessioned: 2025-12-28T16:50:22Z
dc.date.available: 2025-12-28T16:50:22Z
dc.date.issued: 2025
dc.identifier.uri: https://doi.org/10.15311/selcukdentj.1674113
dc.identifier.uri: https://hdl.handle.net/20.500.12933/2978
dc.description.abstract: Background: The aim of this study was to evaluate the performance of four leading Large Language Models (LLMs) on the 2021 Dentistry Specialization Exam (DSE). Methods: A total of 112 questions were used, comprising 39 questions in basic sciences and 73 in clinical sciences; questions containing figures and graphs from the 2021 DSE were excluded. The study evaluated four LLMs: Claude-3.5 Haiku, GPT-3.5, Co-pilot, and Gemini-1.5. Results: In basic sciences, Claude-3.5 Haiku and GPT-3.5 answered 100% of the questions correctly, while Gemini-1.5 answered 94.9% and Co-pilot 92.3% correctly. In clinical sciences, Claude-3.5 Haiku showed a correct answer rate of 89%, Co-pilot 80.9%, GPT-3.5 79.7%, and Gemini-1.5 65.7%. Across all questions, Claude-3.5 Haiku showed a correct answer rate of 92.85%, GPT-3.5 86.6%, Co-pilot 84.8%, and Gemini-1.5 75.9%. While the performance of the LLMs in basic sciences was similar (p=0.134), there was a statistically significant difference between their performances in clinical sciences and across all questions (p=0.007 and p=0.005, respectively). Conclusion: Across all questions and in clinical sciences, Claude-3.5 Haiku performed best, Gemini-1.5 performed worst, and GPT-3.5 and Co-pilot performed similarly. All four LLMs showed a higher success rate in basic sciences than in clinical sciences. The results show that AI-based LLMs can perform well on knowledge-based questions, such as those in basic sciences, but perform poorly on questions that also require clinical reasoning, discussion, and interpretation, such as those in clinical sciences. © 2025, Selcuk University. All rights reserved.
dc.language.iso: en
dc.publisher: Selcuk University
dc.relation.ispartof: Selcuk Dental Journal
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Artificial intelligence
dc.subject: Dentistry
dc.subject: Dentistry specialization training
dc.subject: Large language model
dc.title: Comparative Evaluation of Four Large Language Models in Turkish Dentistry Specialization Exam
dc.title.alternative: Türk Diş Hekimliği Uzmanlık Sınavında Dört Büyük Dil Modelinin Karşılaştırmalı Değerlendirilmesi
dc.type: Article
dc.department: Afyonkarahisar Sağlık Bilimleri Üniversitesi
dc.identifier.doi: 10.15311/selcukdentj.1674113
dc.identifier.volume: 12
dc.identifier.issue: 4
dc.identifier.startpage: 6
dc.identifier.endpage: 10
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.department-temp: Ekici, Omer, Department of Oral and Maxillofacial Surgery, Afyonkarahisar Health Sciences University, Afyonkarahisar, Afyonkarahisar, Turkey
dc.identifier.scopus: 2-s2.0-105020858078
dc.identifier.scopusquality: N/A
dc.indekslendigikaynak: Scopus
dc.snmz: KA_Scopus_20251227
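
The abstract reports per-model accuracy rates and p-values but does not state which statistical test produced them. The short Python sketch below is an illustration rather than the authors' method: it assumes a chi-square test of independence on correct/incorrect counts, with the counts reconstructed from the reported overall percentages (e.g. 92.85% of 112 questions = 104 correct for Claude-3.5 Haiku).

# Illustrative sketch only: the chi-square test of independence is an assumption;
# the abstract does not state how its p-values were obtained.
from scipy.stats import chi2_contingency

TOTAL_QUESTIONS = 112  # 39 basic-science + 73 clinical-science items

# Correct-answer counts per model over all 112 questions, reconstructed from the
# reported accuracy rates (92.85%, 86.6%, 84.8%, 75.9%).
correct = {
    "Claude-3.5 Haiku": 104,
    "GPT-3.5": 97,
    "Co-pilot": 95,
    "Gemini-1.5": 85,
}

# Build a models x (correct, incorrect) contingency table and test whether
# accuracy differs across the four models.
table = [[c, TOTAL_QUESTIONS - c] for c in correct.values()]
chi2, p_value, dof, _ = chi2_contingency(table)

for model, c in correct.items():
    print(f"{model}: {c}/{TOTAL_QUESTIONS} correct ({c / TOTAL_QUESTIONS:.1%})")
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")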



There are no files associated with this item.
