
dc.contributor.author: Yilmaz, Birkan Eyup
dc.contributor.author: Yilmaz, Busra Nur Gokkurt
dc.contributor.author: Ozbey, Furkan
dc.date.accessioned: 2025-12-28T16:40:23Z
dc.date.available: 2025-12-28T16:40:23Z
dc.date.issued: 2025
dc.identifier.issn: 1472-6831
dc.identifier.uri: https://doi.org/10.1186/s12903-025-05926-2
dc.identifier.uri: https://hdl.handle.net/20.500.12933/2546
dc.description.abstract: Background: Artificial intelligence (AI) has advanced rapidly in healthcare and dental education, significantly impacting diagnostic processes, treatment planning, and academic training. The aim of this study is to evaluate performance differences among large language models (LLMs) by analyzing their accuracy on multiple-choice oral pathology questions. Methods: This study evaluates the performance of eight LLMs (Gemini 1.5, Gemini 2, ChatGPT 4o, ChatGPT 4, ChatGPT o1, Copilot, Claude 3.5, Deepseek) in answering multiple-choice oral pathology questions from the Turkish Dental Specialization Examination (DUS). A total of 100 questions from 2012 to 2021 were analyzed. Questions were classified as case-based or knowledge-based, and responses were scored as correct or incorrect against the official answer keys. To prevent learning biases, no follow-up questions or feedback were provided after the LLMs' responses. Results: Significant performance differences were observed among the models (p < 0.001). ChatGPT o1 achieved the highest accuracy (96 correct, 4 incorrect), followed by Claude (84 correct), then Gemini 2 and Deepseek (82 correct each). Copilot had the lowest performance (61 correct). Case-based questions showed notable performance variation (p = 0.034), with ChatGPT o1 and Claude excelling. For knowledge-based questions, ChatGPT o1 and Deepseek demonstrated the highest accuracy (p < 0.001). Post-hoc analysis revealed that ChatGPT o1 performed significantly better than most other models on both case-based and knowledge-based questions (p < 0.0031). Conclusion: LLMs demonstrated variable proficiency on oral pathology questions, with ChatGPT o1 showing the highest accuracy. LLMs show promise as a supplementary educational tool, though further validation is required.
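The post-hoc threshold quoted in the abstract (p < 0.0031) is consistent with a Bonferroni-style adjustment of the 0.05 significance level across multiple pairwise comparisons. As an illustration only (the record does not include the authors' analysis code), the sketch below reproduces that style of analysis in Python using the correct/incorrect counts the abstract reports; the use of chi-square tests and the exact set of pairwise comparisons are assumptions, not the authors' published procedure.

```python
# A minimal sketch, NOT the authors' code: an overall chi-square test of
# accuracy across models, followed by Bonferroni-corrected pairwise 2x2
# comparisons. Counts are the correct-answer tallies reported in the
# abstract (100 questions per model); models without reported counts
# are omitted rather than invented.
from itertools import combinations
from scipy.stats import chi2_contingency

correct = {
    "ChatGPT o1": 96,
    "Claude 3.5": 84,
    "Gemini 2": 82,
    "Deepseek": 82,
    "Copilot": 61,
}
TOTAL = 100  # questions answered by each model

# Overall test: models x (correct, incorrect) contingency table.
table = [[c, TOTAL - c] for c in correct.values()]
chi2, p, dof, _ = chi2_contingency(table)
print(f"overall: chi2={chi2:.2f}, dof={dof}, p={p:.4g}")

# Pairwise comparisons with a Bonferroni-adjusted alpha.
pairs = list(combinations(correct, 2))
alpha = 0.05 / len(pairs)  # adjusted significance threshold
for a, b in pairs:
    sub = [[correct[a], TOTAL - correct[a]],
           [correct[b], TOTAL - correct[b]]]
    chi2, p, _, _ = chi2_contingency(sub)
    flag = "*" if p < alpha else ""
    print(f"{a} vs {b}: p={p:.4f} {flag}")
```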
dc.language.iso: en
dc.publisher: BMC
dc.relation.ispartof: BMC Oral Health
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Artificial intelligence
dc.subject: Oral pathology
dc.subject: Large language models
dc.title: Artificial intelligence performance in answering multiple-choice oral pathology questions: a comparative analysis
dc.type: Article
dc.identifier.orcid: 0000-0001-5327-1953
dc.identifier.orcid: 0000-0002-3123-5190
dc.identifier.orcid: 0000-0002-1336-4239
dc.department: Afyonkarahisar Sağlık Bilimleri Üniversitesi
dc.identifier.doi: 10.1186/s12903-025-05926-2
dc.identifier.volume: 25
dc.identifier.issue: 1
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.department-temp: [Yilmaz, Birkan Eyup] Giresun Univ, Fac Dent, Dept Oral & Maxillofacial Surg, Giresun, Turkiye; [Yilmaz, Busra Nur Gokkurt] Giresun Oral & Dent Hlth Ctr, Dept Dentomaxillofacial Radiol, Giresun, Turkiye; [Ozbey, Furkan] Afyonkarahisar Hlth Sci Univ, Fac Dent, Dept Dentomaxillofacial Radiol, Afyonkarahisar, Turkiye
dc.identifier.pmid: 40234873
dc.identifier.scopus: 2-s2.0-105002770899
dc.identifier.scopusquality: Q2
dc.identifier.wos: WOS:001468436100006
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: PubMed
dc.snmz: KA_WoS_20251227


Files in this item:

There are no files associated with this item.

This item appears in the following collection(s).
