Doctors are increasingly turning to electronic records for information to guide treatment decisions for their patients. To help them do this more quickly, researchers from MIT, the MIT-IBM Watson AI Lab, and IBM Research, working with physicians and medical experts, have developed machine learning models. Their study, “Learning to Ask Like a Physician,” was posted on arXiv on June 6, 2022, and will be presented at the annual conference of the North American chapter of the Association for Computational Linguistics (NAACL).
Physicians often query electronic health records (EHRs) to make informed decisions about patient care. However, a 2004 study found that it took them an average of 8.3 minutes to find an answer to a single question, despite being trained to use EHRs, leaving them less time to interact with their patients.
Existing models struggle to generate relevant questions and, most of the time, fail to return correct answers to the questions physicians actually ask.
Researchers have begun developing machine learning models that could streamline the process by automatically finding the information physicians need in an EHR. To be effective, however, these models must be trained on huge datasets of relevant medical questions, and patient privacy makes such data hard to come by.
Overcoming the medical data shortage
To overcome the data shortage, the researchers worked with 10 medical experts of different skill levels to compile DiSCQ, a new dataset of more than 2,000 questions.
Eric Lehman, the study’s lead author and a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL), explains:
“Two thousand questions may seem like a lot, but when you look at the machine learning models that are trained today, they have so much data, maybe billions of data points. When you train machine learning models to work in healthcare settings, you have to be really creative because there is such a lack of data.”
To build DiSCQ, the MIT researchers asked these experts (practicing physicians and medical students in their final year of training) to read 100 EHR discharge summaries and ask any questions that came to mind. They placed no restrictions on question types or structure, in order to collect natural questions. As they had anticipated, most questions concerned the patient’s symptoms, treatments, or test results.
They also asked the medical experts to identify the “trigger text” in the EHR that led them to ask each question. For example, if a note mentions prostate cancer in the patient’s medical history, the trigger text “prostate cancer” might lead the expert to ask questions such as “date of diagnosis?” or “any procedures performed?”
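As an illustration, a DiSCQ-style record pairing a note with its trigger-anchored questions could be represented as in the minimal Python sketch below. The field names and structure are assumptions made for illustration and do not mirror the released dataset’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TriggeredQuestion:
    """One expert question tied to the span of EHR text that prompted it.

    Field names are illustrative, not the released DiSCQ schema.
    """
    trigger: str   # e.g. "prostate cancer"
    question: str  # e.g. "date of diagnosis?"

@dataclass
class DischargeSummaryAnnotation:
    note_id: str    # identifier of the EHR note shown to the expert
    note_text: str  # the discharge-summary text itself
    questions: List[TriggeredQuestion] = field(default_factory=list)

# A toy record based on the prostate-cancer example above
record = DischargeSummaryAnnotation(
    note_id="note-001",
    note_text="Past medical history significant for prostate cancer ...",
    questions=[
        TriggeredQuestion(trigger="prostate cancer", question="date of diagnosis?"),
        TriggeredQuestion(trigger="prostate cancer", question="any procedures performed?"),
    ],
)
```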
The researchers then compiled their set of questions with the accompanying trigger text and used it to train machine learning models to ask new questions based on a given trigger.
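The article does not pin down a particular architecture, but one plausible setup, sketched below, fine-tunes an off-the-shelf sequence-to-sequence model (here BART via Hugging Face Transformers, an assumption) to generate a question from a note in which the trigger span has been marked. The trigger-marking scheme and hyperparameters are illustrative only.

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast

# Assumed model choice for illustration; the study's actual models may differ.
tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Mark the trigger span inside the note so the generator knows what to ask about.
# (In practice the markers would likely be registered as special tokens.)
note = "Past medical history significant for prostate cancer ..."
trigger = "prostate cancer"
source = note.replace(trigger, f"<trigger> {trigger} </trigger>")
target = "date of diagnosis?"

inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

# One gradient step on a single (trigger-marked note, question) pair.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# At inference time, generate a new question for a marked trigger.
generated = model.generate(**inputs, num_beams=4, max_length=32)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```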
Then, medical experts judged whether these questions were “good” by rating their understandability, triviality, medical relevance, and relevance to the trigger (is the question related to the trigger that prompted it?).
They found that, given a text trigger, the model generated a good question 63% of the time, while a human doctor asked a good question 80% of the time.
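How the four ratings combine into a single “good question” verdict is not spelled out here, so the short sketch below assumes a question counts as good only if it passes every criterion; the actual aggregation rule used in the study may differ.

```python
from typing import Dict, List

def is_good(rating: Dict[str, bool]) -> bool:
    # Assumed rule: good = understandable, non-trivial, medically relevant,
    # and related to its trigger.
    return (
        rating["understandable"]
        and not rating["trivial"]
        and rating["medically_relevant"]
        and rating["relevant_to_trigger"]
    )

def good_question_rate(ratings: List[Dict[str, bool]]) -> float:
    return sum(is_good(r) for r in ratings) / len(ratings)

# Toy example: 2 of 3 generated questions pass every criterion.
ratings = [
    {"understandable": True, "trivial": False, "medically_relevant": True, "relevant_to_trigger": True},
    {"understandable": True, "trivial": True,  "medically_relevant": True, "relevant_to_trigger": True},
    {"understandable": True, "trivial": False, "medically_relevant": True, "relevant_to_trigger": True},
]
print(f"{good_question_rate(ratings):.0%}")  # -> 67%
```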
Peter Szolovits, one of the paper’s senior authors, is a professor in the Department of Electrical Engineering and Computer Science (EECS) who leads CSAIL’s clinical decision-making group and is a member of the MIT-IBM Watson AI Lab. He explains:
“Realistic data is essential for training models that are relevant to the task but difficult to find or create. The value of this work lies in the careful collection of questions asked by clinicians about patient cases, from which we are able to develop methods that use this data and general language models to ask other plausible questions.”
On the relevance of the data
For comparison, the researchers also trained models on the publicly available datasets they had found at the start of the project. Those models were able to recover only about 25% of the answers to the physician-generated questions.
Eric Lehman states:
“This result is really concerning. What people thought were successful models were, in practice, just horrible because the assessment questions they were testing on were not good to begin with.”
The team is now applying this work to its goal: building a model that can automatically answer physicians’ questions in an EHR.
There is still a long way to go before this model becomes a reality, yet Eric Lehman says he is encouraged by the team’s strong initial results.
Article sources:
“Learning to Ask Like a Physician,” arXiv:2206.02696v1, 6 June 2022.
Authors: Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy, Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I. Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna Rumshisky, Jenifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits
Translated from the French article “Des chercheurs développent une IA pour aider les médecins à trouver les informations pertinentes au sein des dossiers électroniques” (“Researchers develop an AI to help physicians find relevant information in electronic records”).