Is ChatGPT better on the Plastic Surgery Inservice than doctors? Journal time!

Posted on January 19, 2024

Quick blog, but this made me smile.

The December 2023 issue of Aesthetic Surgery Journal ran two separate studies: “Performance of ChatGPT on the Plastic Surgery Inservice Training Examination” and “ChatGPT is Equivalent to First Year Plastic Surgery Residents.”

The first had ChatGPT take the 2022 Plastic Surgery In-Service Exam (omitting questions with tables or images). Incorrect responses were categorized as logic, information, or explicit fallacies. ChatGPT answered 242 questions with 55% accuracy, and there was a statistically significant difference in its use of external information. Overall, it used logical reasoning on 89% of questions, internal information on 95.5%, and external information on 92%.

The second study looked at In-Service Exams from 2018 to 2022, a total of 1,129 questions. ChatGPT answered 55.8% correctly. That score put it at the 49th percentile relative to first-year integrated plastic surgery residents (but only the 5th percentile relative to third/fourth-year residents, and the 0th percentile relative to fifth/sixth-year residents).

My thoughts?

PEOPLE FOR THE WIN! I love that humans are still better than AI on these tests. It was interesting to see the accuracy come in so low, and the first study seems to attribute part of that to ChatGPT's use of external information.

That wouldn’t surprise me. One of the biggest issues with the internet is the quality of information. Who do you believe? What is fake news? Without an educated filter (no, breasts do not weigh 15 pounds, even though you saw a TikTok that said they do), you may get things wrong. For something as important as your health and safety, you still need a doctor. AI will likely get there, but not yet.