AI in Medical Education - Everything, everywhere, all at once? (Part 2/3)
- Michael Co
- Dec 15, 2023
- 4 min read
Updated: Dec 22, 2023

Michael Co, Chief Medical Advisor, BME Innovation
ChatGPT for generating undergraduate medical assessment questions
The first project that came to mind at the very introduction of GenAI was whether it could generate good multiple choice questions (MCQs) for undergraduate medical examinations. The quality of an MCQ is generally determined by several factors, including a clear stem, a good lead-in, quality distractors and the ability to pass the hand-cover test. To us, drafting MCQs for written exams is truly labour intensive. I therefore decided to collaborate with some of my friends, both locally and internationally, on a project evaluating ChatGPT's capability in generating medical exam MCQs. The project involved input from clinical teachers at the University of Hong Kong, University of Edinburgh, National University of Singapore and University of Galway in Ireland.
In that study, 50 MCQs were generated by ChatGPT with reference to two standard undergraduate medical textbooks (Harrison’s, and Bailey & Love’s). Another 50 MCQs were drafted by two university professoriate staff (Surgery and Medicine) using the same textbooks. All 100 MCQs were individually numbered, randomized and sent to five independent international assessors for quality assessment.
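As a rough sketch of how the quality criteria above might be encoded when asking a model to draft a question, a prompt along the following lines could be used. The wording, topic and helper function here are illustrative assumptions, not the exact prompts used in our study.

```python
# Illustrative prompt template only; not the exact wording used in the study.
MCQ_PROMPT_TEMPLATE = """
Draft one single-best-answer multiple choice question for a final-year
undergraduate medical examination on the topic of {topic}, based on the
standard textbook coverage of this topic ({textbook}).

Requirements:
- A clear clinical stem containing all information needed to answer the question.
- A focused lead-in that passes the hand-cover test (the question should be
  answerable even with the options covered).
- One unambiguously correct answer and four plausible distractors.
- Provide the answer key and a one-line explanation at the end.
"""

def build_mcq_prompt(topic: str, textbook: str) -> str:
    """Fill in the template for a given topic and reference textbook."""
    return MCQ_PROMPT_TEMPLATE.format(topic=topic, textbook=textbook)

print(build_mcq_prompt("acute cholecystitis", "Bailey & Love's Short Practice of Surgery"))
```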
We found that the total time required for ChatGPT to create the 50 questions was 20 minutes 25 seconds, while it took the two human examiners (Dr. Gary Lau from Medicine, HKU, and me) a total of 211 minutes 33 seconds to draft theirs.
When the mean scores of the AI-constructed questions were compared with those of the human-drafted questions, the AI was inferior to humans only in the relevance domain (AI: 7.56 +/- 0.94 vs human: 7.88 +/- 0.52; p = 0.04). There was no significant difference in overall question quality between the AI-drafted and human-drafted questions.
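For readers who want to see how such a comparison can be run on summary statistics, here is a minimal sketch assuming an independent-samples t-test on per-question scores (n = 50 questions per arm); the study's actual statistical test and unit of analysis may differ, and only the means and standard deviations above are taken from the text.

```python
# Illustrative re-check of the relevance-domain comparison, assuming an
# independent-samples t-test with n = 50 questions per arm (an assumption;
# the study's exact test may differ). Means/SDs are from the text above.
from scipy.stats import ttest_ind_from_stats

result = ttest_ind_from_stats(
    mean1=7.56, std1=0.94, nobs1=50,   # AI-generated questions
    mean2=7.88, std2=0.52, nobs2=50,   # human-drafted questions
)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")  # p comes out near 0.04
```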
Our study concluded that generative AI is capable of generating good-quality MCQs for undergraduate medical examinations; the results were published in PLOS One (2023).
Virtual surgical bedside teaching using a generative AI chatbot
The idea of using a chatbot in bedside teaching first came to me when I was dealing with complicated flight cancellations and refunds through an airline chatbot during the COVID-19 pandemic. As an end user of a customer service chatbot, I won’t deny that I was annoyed and frustrated from time to time at not being able to get the answers I wanted, especially when monotonous answers or generic messages kept popping up on my screen.
With this in mind, I have been collaborating with John Yuen, an IT expert, since 2021 on developing a chatbot for virtual bedside teaching. The first chatbot, built in 2021, served as our virtual patient during bedside teaching: students were asked to take a history from the chatbot, which answered with predefined responses. This new teaching method was evaluated in a prospective case-control study in which students were assigned either to bedside teaching with the chatbot or to conventional bedside teaching, taking histories from genuine patients. Our study was published in Heliyon in 2022 and concluded that the history-taking skills of students taught by chatbot virtual bedside teaching were not inferior to those of students taught by the conventional physical method (2).
Our students were generally happy with bedside teaching that used the chatbot as their virtual patient. However, many of them reflected that the chatbot’s answers were monotonous, with quite a lot of generic responses like “I do not understand your question, could you rephrase it?”. This is exactly what I had encountered with the airline customer service chatbot during the COVID-19 pandemic.
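To illustrate why a first-generation, predefined-response chatbot behaves this way, here is a minimal sketch of the kind of keyword-matching logic such a system might use; the keywords, answers and function name are illustrative assumptions, not our actual implementation.

```python
# Minimal sketch of a rule-based virtual patient (illustrative only; the
# keywords, answers and structure are assumptions, not our actual chatbot).

# Predefined responses keyed by a keyword the student's question must contain.
PREDEFINED_RESPONSES = {
    "pain": "The pain is in the right upper part of my tummy.",
    "fever": "Yes, I have felt feverish since yesterday.",
    "medication": "I only take a tablet for high blood pressure.",
    "allergy": "I am not allergic to anything that I know of.",
}

# Generic fallback: the source of the monotonous replies students noticed.
FALLBACK = "I do not understand your question, could you rephrase it?"


def virtual_patient_reply(question: str) -> str:
    """Return a predefined answer if a keyword matches, otherwise fall back."""
    q = question.lower()
    for keyword, answer in PREDEFINED_RESPONSES.items():
        if keyword in q:
            return answer
    return FALLBACK


if __name__ == "__main__":
    print(virtual_patient_reply("Where exactly is the pain?"))
    print(virtual_patient_reply("Tell me about your family history."))  # hits the fallback
```

Any question outside the predefined keyword list falls through to the same generic message, which is precisely the limitation the students reported.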
Over the past 12 months, further improvements were made to our chatbot. The second-generation chatbot was developed with John again in 2023. It simulates real-world hospital settings with different wards and bed numbers. We called this chatbot app Bennie and the Chats 2.0, the first product of BME Innovation / BME Systems.
For bedside teaching, medical students are directed to their designated virtual ward and virtual patient for clinical history taking. It took us more than a year to develop this AI chatbot.
There are 10 different departments in this virtual hospital. Within the Department of Surgery, we have designed individual specialty wards such as a surgical gastrointestinal ward, a breast surgery ward, a hepatobiliary surgery ward and a burn unit.
Clinical teachers are invited to design their own virtual patients on a designated case creation platform; the cases can be simple or complex depending on the students’ level of learning. Patient details such as clinical symptoms, background medical information and personality are entered into the chatbot administration page, and the chatbot then interacts with the students according to this predefined information. On the case creation platform, clinical teachers can also edit or delete case scenarios when necessary.
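As a rough illustration of how such a case definition might drive a generative AI virtual patient, here is a minimal sketch in which the teacher-entered fields are folded into a system prompt for a chat model. The field names, the `openai` client call and the model name are assumptions for illustration, not the actual Bennie and the Chats 2.0 implementation.

```python
# Illustrative sketch only: teacher-entered case details turned into a system
# prompt for a generative virtual patient. Field names, model and API usage
# are assumptions, not the actual Bennie and the Chats 2.0 implementation.
from openai import OpenAI

case = {
    "ward": "Breast surgery ward",
    "bed": "12A",
    "symptoms": "A painless right breast lump noticed 3 months ago",
    "background": "55-year-old woman, hypertension on amlodipine, no family history of breast cancer",
    "personality": "Anxious but cooperative; gives short answers unless reassured",
}

system_prompt = (
    "You are a virtual patient for undergraduate bedside teaching.\n"
    f"Ward: {case['ward']}, bed {case['bed']}.\n"
    f"Presenting complaint: {case['symptoms']}.\n"
    f"Background: {case['background']}.\n"
    f"Personality: {case['personality']}.\n"
    "Answer only as the patient, in the first person, and do not volunteer "
    "information the student has not asked about."
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def patient_reply(history: list[dict], student_question: str) -> str:
    """Return the virtual patient's reply to one student question."""
    messages = [{"role": "system", "content": system_prompt}] + history
    messages.append({"role": "user", "content": student_question})
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content
```

Because the case details live in the system prompt rather than in a fixed answer list, the same student question can be answered in the patient's own "voice", which is what allows the richer interaction described below.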
A pilot run of the virtual surgical bedside teaching (breast surgery) was started in August 2023 with final-year medical students. Students were asked to take a clinical history from the chatbot (via mobile app or webpage) for 30 minutes in a tutorial room. This was followed by a demonstration of the clinical signs on a manikin by the clinical teacher, and by group discussions of the virtual patient’s clinical photos, mammogram and ultrasound images, biopsy reports and subsequent surgical management.
The initial experience was successful, with no major technical issues reported. Compared with the first-generation chatbot, the generative AI component allows better interaction with students. It is even capable of short chitchat with the students, which resembles real-life dialogue between patients and healthcare professionals.
One obvious advantage of using a chatbot in virtual bedside teaching is that clinical teachers can design their own virtual patients and tailor-make case scenarios that fit students’ learning needs.
To be continued ...