Division
East Florida
Hospital
HCA Florida Westside Hospital
Specialty
Pathology
Document Type
Poster
Publication Date
2025
Keywords
artificial intelligence, AI, large language models, LLM, pathology, diagnosis
Disciplines
Diagnosis | Pathology
Abstract
Introduction: In recent years, artificial intelligence tools such as large language models (LLMs) have expanded the potential of diagnostic medicine, including histopathology. This study evaluates the diagnostic ability and utility of publicly available LLMs in predicting the correct diagnosis of unknown cases from mobile phone images of hematoxylin-eosin (H-E) stained slides, and compares their performance with that of residents.
Method: Twenty cases covering a variety of entities were collected from teaching sets of non-HCA patients and publicly available domains that are used for residents’ unknown slide sessions. Three publicly available LLMs, Chat-GPT 4.0, Claude 3.5 Sonnet, and Gemini 1.5 Flash, were used to generate diagnoses from the H-E slide histology images using a standard prompt. The same cases were evaluated blindly by four residents. The accuracy of the three LLMs was compared with each other and with the accuracy rate of the residents.
Results: Claude was the most accurate LLM, with an accuracy rate of 50%, followed by Gemini (40%) and Chat-GPT (35%). The highest accuracy rate among the LLMs (50%) was lower than the lowest accuracy rate among the residents (55%). The average accuracy rate was 41.67% for the LLMs versus 67.5% for the residents.
Conclusions: Current LLMs are not accurate enough for diagnostic use and need further improvement.
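The summary figures in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-model percentages, the case count, and the resident figures are taken from the abstract, while the correct-case counts are an assumption back-calculated from those percentages (the abstract does not state them), and all variable names are hypothetical.

```python
# Number of cases evaluated (from the Method paragraph).
N_CASES = 20

# Accuracy rates in percent, as reported in the Results paragraph.
llm_accuracy_pct = {"Claude": 50.0, "Gemini": 40.0, "Chat-GPT": 35.0}

# Assumption: correct-case counts are back-calculated from the reported
# percentages and the 20-case total; the abstract does not list them.
correct_counts = {
    name: round(pct / 100 * N_CASES) for name, pct in llm_accuracy_pct.items()
}

# Unweighted mean accuracy across the three LLMs.
mean_llm_pct = sum(llm_accuracy_pct.values()) / len(llm_accuracy_pct)

# Resident figures as reported; individual resident scores are not given.
RESIDENT_MEAN_PCT = 67.5
RESIDENT_MIN_PCT = 55.0

# Gap between the best LLM and the lowest-scoring resident.
best_llm = max(llm_accuracy_pct, key=llm_accuracy_pct.get)
gap_to_weakest_resident = RESIDENT_MIN_PCT - llm_accuracy_pct[best_llm]

print(correct_counts)
print(f"Mean LLM accuracy: {mean_llm_pct:.2f}%")
print(f"Best LLM: {best_llm}, gap to weakest resident: {gap_to_weakest_resident} points")
```

With 20 cases, each case is worth 5 percentage points, so the gap between the best LLM and the lowest-scoring resident corresponds to a single case.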
Original Publisher
HCA Healthcare Graduate Medical Education
Recommended Citation
Wymer, Gul Emek and Ferra, Susana, "Can Residents Cheat the Unknown Slide Sessions by Using the Large Language Models? Which Model Should They Use: Chat-Gpt, Claude, Or Gemini?" (2025). East Florida Division GME Research Day 2025. 4.
https://scholarlycommons.hcahealthcare.com/eastflorida2025/4