Canadian perspective on teaching artificial intelligence to medical students

Applications of clinical artificial intelligence (AI) are growing rapidly, but existing medical school curricula offer limited teaching covering this area. Here we describe an artificial intelligence training course we developed and delivered to Canadian medical students and make recommendations for future training.
Artificial intelligence (AI) in medicine can improve workplace efficiency and aid clinical decision making. To safely guide the use of artificial intelligence, physicians must have some understanding of it. Many commentaries advocate teaching AI concepts1, such as explaining AI models and verification processes2. However, few structured plans have been implemented, especially at the national level. Pinto dos Santos et al.3 surveyed 263 medical students, and 71% agreed that they needed training in artificial intelligence. Teaching artificial intelligence to a medical audience requires careful design that combines technical and non-technical concepts for students whose prior knowledge often varies widely. We describe our experience delivering a series of AI workshops to three groups of medical students and make recommendations for future medical education in AI.
Our five-week Introduction to Artificial Intelligence in Medicine workshop for medical students was held three times between February 2019 and April 2021. A schedule for each workshop, with a brief description of changes to the course, is shown in Figure 1. Our course has three primary learning objectives: students understand how data is processed in artificial intelligence applications, analyze the artificial intelligence literature for clinical applications, and take advantage of opportunities to collaborate with engineers developing artificial intelligence.
In Figure 1, blue indicates lecture topics and light blue indicates interactive question and answer periods. Gray sections are brief literature reviews. Orange sections are selected case studies that describe artificial intelligence models or techniques. Green sections are guided programming exercises designed to teach how artificial intelligence is applied to clinical problems and how models are evaluated. The content and duration of the workshops varied based on an assessment of student needs.
The first workshop was held at the University of British Columbia from February to April 2019, and all 8 participants gave positive feedback4. Due to COVID-19, the second workshop was held virtually in October-November 2020, with 222 medical students and 3 residents from 8 Canadian medical schools registering. Presentation slides and code have been uploaded to an open access site (http://ubcaimed.github.io). The key feedback from the first iteration was that the lectures were too intense and the material too theoretical. Serving Canada’s six time zones posed an additional challenge. The second workshop therefore shortened each session to 1 hour, simplified the course material, added more case studies, and provided template programs that allowed participants to complete code snippets with minimal debugging (Box 1). Key feedback from the second iteration included positive comments on the programming exercises and a request to demonstrate how to plan a machine learning project. In our third workshop, held virtually for 126 medical students in March-April 2021, we therefore included more interactive coding exercises and project feedback sessions to demonstrate how workshop concepts apply to projects.
Data Analysis: A field of study in statistics that identifies meaningful patterns in data by analyzing, processing, and communicating data patterns.
Data mining: The process of identifying and extracting data. In the context of artificial intelligence, the data set is often large, with multiple variables for each sample.
Dimensionality reduction: The process of transforming data with many individual features into fewer features while preserving the important properties of the original data set.
Feature (in the context of artificial intelligence): A measurable property of a sample. Often used interchangeably with “attribute” or “variable”.
Gradient Activation Map: A technique used to interpret artificial intelligence models (especially convolutional neural networks), which analyzes the process of optimizing the last part of the network to identify regions of data or images that are highly predictive.
Standard Model: An existing AI model that has been pre-trained to perform similar tasks.
Testing (in the context of artificial intelligence): observing how a model performs a task using data it has not encountered before.
Training (in the context of artificial intelligence): Providing a model with data and results so that the model adjusts its internal parameters to optimize its ability to perform tasks using new data.
Vector: array of data. In machine learning, each array element is usually a unique feature of the sample.
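The glossary entries for vectors, features, and dimensionality reduction can be made concrete with a small sketch. It is not taken from the workshop materials. It assumes NumPy is available, uses purely synthetic data, and uses principal component analysis (PCA) as one common dimensionality reduction technique; the function name `reduce_dimensions` is hypothetical.

```python
import numpy as np


def reduce_dimensions(samples, n_components):
    """Project feature vectors onto their top principal components (PCA)."""
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean(axis=0)  # centre each feature at zero
    # Rows of vt are the principal directions, ordered by variance explained.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return reduced, explained[:n_components]


# Synthetic data (hypothetical): 100 samples, each a vector of 5 features.
# Features 1 to 3 are noisy copies of feature 0; feature 4 is independent.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
samples = np.hstack([
    base,
    base + 0.05 * rng.normal(size=(100, 3)),
    rng.normal(size=(100, 1)),
])

reduced, explained = reduce_dimensions(samples, n_components=2)
print(samples.shape, "->", reduced.shape)
print("fraction of variance kept:", explained.sum())
```

Each row of `samples` is one vector, and each column is one feature. Because four of the five features carry nearly the same information, two derived features preserve almost all of the variation in the original data set.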
Table 1 lists the latest curriculum as of April 2021, including targeted learning objectives for each topic. This workshop is intended for those new to the technical material and does not require any mathematical knowledge beyond the first year of an undergraduate medical degree. The course was developed by 6 medical students and 3 teachers with advanced degrees in engineering. The engineers developed the artificial intelligence theory to teach, and the medical students worked on the clinically relevant material.
Workshops include lectures, case studies, and guided programming. In the first lecture, we review selected data analysis concepts from biostatistics, including data visualization, logistic regression, and the comparison of descriptive and inferential statistics. Although data analysis is the foundation of artificial intelligence, we exclude topics such as data mining, significance testing, and interactive visualization. This was due to time constraints and because some undergraduate students had prior training in biostatistics and wanted to cover topics more specific to machine learning. The subsequent lecture introduces modern methods and discusses AI problem formulation, advantages and limitations of AI models, and model testing. The lectures are complemented by literature reviews and case studies of existing artificial intelligence devices. We emphasize the skills required to evaluate the effectiveness and feasibility of a model to address clinical questions, including understanding the limitations of existing artificial intelligence devices. For example, we asked students to interpret the pediatric head injury guidelines proposed by Kuppermann et al.5, which implemented an artificial intelligence decision tree algorithm to determine whether a CT scan would be useful based on a physician’s examination. We emphasize that this is a common example of AI providing predictive analytics for physicians to interpret, rather than replacing physicians.
In the openly available guided programming examples (https://github.com/ubcaimed/ubcaimed.github.io/tree/master/programming_examples), we demonstrate how to perform exploratory data analysis, dimensionality reduction, standard model loading, training, and testing. We use Google Colaboratory notebooks (Google LLC, Mountain View, CA), which allow Python code to be executed from a web browser. Figure 2 provides an example of a programming exercise. This exercise involves predicting malignancies using the Wisconsin Open Breast Imaging Dataset6 and a decision tree algorithm.
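An exercise of this kind might look like the following minimal sketch. It is not the workshop's notebook. It assumes scikit-learn is installed and that the Wisconsin diagnostic breast cancer data bundled with scikit-learn (`load_breast_cancer`) can stand in for the dataset cited above; the tree depth, split ratio, and random seed are arbitrary illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the bundled data: 569 samples, 30 numeric features per sample.
# In this copy the target is 0 for malignant and 1 for benign.
data = load_breast_cancer()
X, y = data.data, data.target
print("samples, features:", X.shape)
print("classes:", list(data.target_names))

# Hold out a quarter of the samples so the model is tested on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Training: the tree learns threshold rules on the features.
# A shallow tree (assumed depth 3) keeps the learned rules easy to read.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Testing: accuracy on samples the model has not encountered before.
test_accuracy = model.score(X_test, y_test)
print("test accuracy:", round(test_accuracy, 3))
```

Separating the data before training mirrors the glossary definitions of training and testing: the reported accuracy comes only from samples the tree never saw while learning.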
Programming exercises are presented throughout the weeks on related topics, with examples selected from published AI applications. Programming elements are only included if they are considered relevant to providing insight into future clinical practice, such as how to evaluate models to determine whether they are ready for use in clinical trials. These examples culminate in a full-fledged end-to-end application that classifies tumors as benign or malignant based on medical image parameters.
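As an illustration of what evaluating a model can involve, the sketch below computes sensitivity and specificity from a list of true labels and model predictions. The choice of these two metrics, the function names, and the labels are assumptions made for illustration; the labels are invented and are not results from the workshop or from any real model.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print("sensitivity:", sens)
print("specificity:", round(spec, 3))
```

Reporting both numbers matters clinically: a model can score well on overall accuracy while still missing a meaningful share of malignant cases.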
Heterogeneity of prior knowledge. Our participants varied in their level of mathematical knowledge. For example, students with advanced engineering backgrounds were looking for more in-depth material, such as how to perform their own Fourier transforms. However, discussing the Fourier algorithm in class was not possible because it requires in-depth knowledge of signal processing.
Attendance attrition. Attendance at follow-up sessions declined, especially in online formats. A solution may be to track attendance and provide a certificate of completion. Medical schools are known to recognize records of students’ extracurricular academic activities, which may encourage students to earn the certificate.
Course design: Because AI spans so many subfields, selecting core concepts of appropriate depth and breadth can be challenging. For example, the continuum of AI tools from the laboratory to the clinic is an important topic. While we cover data preprocessing, model building, and validation, we do not include topics such as big data analytics, interactive visualization, or conducting AI clinical trials. Instead, we focus on the concepts most specific to AI. Our guiding principle is to improve literacy, not skills. For example, understanding how a model processes input features is important for interpretability. One way to do this is to use gradient activation maps, which can visualize which regions of the data are predictive. However, this requires multivariate calculus and could not be introduced8. Developing a common terminology was challenging because we were trying to explain how to work with data as vectors without mathematical formalism. Note that different terms can have the same meaning; for example, in epidemiology, a “feature” is described as a “variable” or “attribute.”
Knowledge retention. Because opportunities to apply AI are limited, the extent to which participants retain knowledge remains to be seen. Medical school curricula often rely on spaced repetition to reinforce knowledge during practical rotations9, which can also be applied to AI education.
Literacy is more important than proficiency. The material was designed without mathematical rigor, which has been an obstacle when launching clinical courses in artificial intelligence. In the programming examples, we use a template program that allows participants to fill in the blanks and run the software without having to figure out how to set up a complete programming environment.
Concerns about artificial intelligence addressed: There is widespread concern that artificial intelligence could replace some clinical duties3. To address this concern, we explain the limitations of AI, including the fact that almost all AI technologies approved by regulators require physician supervision11. We also emphasize the importance of bias, because algorithms are prone to it, especially if the data set is not diverse12. Consequently, a certain subgroup may be modeled incorrectly, leading to unfair clinical decisions.
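One simple way to surface this kind of problem is to report performance separately for each subgroup rather than only in aggregate. The sketch below is illustrative only: the subgroup labels, outcomes, and predictions are invented, and the function name `accuracy_by_subgroup` is hypothetical.

```python
from collections import defaultdict


def accuracy_by_subgroup(y_true, y_pred, groups):
    """Return the fraction of correct predictions within each subgroup."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical data: subgroup "B" is under-represented in the data set.
groups = ["A"] * 8 + ["B"] * 4
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_subgroup(y_true, y_pred, groups)

print("overall accuracy:", overall)
for group, accuracy in sorted(per_group.items()):
    print("subgroup", group, "accuracy:", accuracy)
```

In this invented example the aggregate figure hides a large gap between the two subgroups, which is the pattern a non-diverse data set can produce.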
Resources are publicly available: We have created publicly available resources, including lecture slides and code. Although access to synchronous content is limited due to time zones, open source content is a convenient method for asynchronous learning since AI expertise is not available at all medical schools.
Interdisciplinary Collaboration: This workshop is a joint venture initiated by medical students to plan courses together with engineers. This demonstrates collaboration opportunities and knowledge gaps in both areas, allowing participants to understand the potential role they can contribute in the future.
Define AI core competencies. Defining a list of competencies provides a standardized structure that can be integrated into existing competency-based medical curricula. This workshop currently uses Learning Objective Levels 2 (Comprehension), 3 (Application), and 4 (Analysis) of Bloom’s Taxonomy. Having resources at higher levels of the taxonomy, such as creating projects, can further strengthen knowledge. This requires working with clinical experts to determine how AI topics can be applied to clinical workflows and to avoid repeating topics already included in standard medical curricula.
Create case studies using AI. Similar to clinical examples, case-based learning can reinforce abstract concepts by highlighting their relevance to clinical questions. For example, one workshop case study analyzed Google’s AI-based diabetic retinopathy detection system13 to identify challenges along the path from lab to clinic, such as external validation requirements and regulatory approval pathways.
Use experiential learning: Technical skills require focused practice and repeated application to master, similar to the rotating learning experiences of clinical trainees. One potential solution is the flipped classroom model, which has been reported to improve knowledge retention in engineering education14. In this model, students review theoretical material independently and class time is devoted to solving problems through case studies.
Scaling for multidisciplinary participants: We envision AI adoption involving collaboration across multiple disciplines, including physicians and allied health professionals with varying levels of training. Therefore, curricula may need to be developed in consultation with faculty from different departments to tailor their content to different areas of health care.
Artificial intelligence is highly technical, and its core concepts are rooted in mathematics and computer science. Training healthcare personnel to understand artificial intelligence presents unique challenges in content selection, clinical relevance, and delivery methods. We hope that the insights gained from the AI in Education workshops will help future educators embrace innovative ways to integrate AI into medical education.
The Google Colaboratory Python script is open source and available at: https://github.com/ubcaimed/ubcaimed.github.io/tree/master/.
Prober, C. G. & Khan, S. Medical education reimagined: a call to action. Acad. Med. 88, 1407–1410 (2013).
McCoy, L. G. et al. What do medical students actually need to know about artificial intelligence? NPJ Digit. Med. 3, 1–3 (2020).
Pinto dos Santos, D. et al. Medical students’ attitude towards artificial intelligence: a multicentre survey. Eur. Radiol. 29, 1640–1646 (2019).
Fan, K. Y., Hu, R. & Singla, R. Introductory machine learning for medical students: a pilot. Med. Educ. 54, 1042–1043 (2020).
Kuppermann, N. et al. Identification of children at very low risk of clinically-important brain injuries after head trauma: a prospective cohort study. Lancet 374, 1160–1170 (2009).
Street, W. N., Wolberg, W. H. & Mangasarian, O. L. Nuclear feature extraction for breast tumor diagnosis. Biomedical Image Processing and Biomedical Visualization 1905, 861–870 (1993).
Chen, P. H. C., Liu, Y. & Peng, L. How to develop machine learning models for healthcare. Nat. Mater. 18, 410–414 (2019).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, 618–626 (2017).
Kumaravel, B., Stewart, C. & Ilic, D. Development and evaluation of a spiral model of assessing EBM competency using OSCEs in undergraduate medical education. BMC Med. Educ. 21, 1–9 (2021).
Kolachalama, V. B. & Garg, P. S. Machine learning and medical education. NPJ Digit. Med. 1, 1–3 (2018).
van Leeuwen, K. G., Schalekamp, S., Rutten, M. J., van Ginneken, B. & de Rooij, M. Artificial intelligence in radiology: 100 commercially available products and their scientific evidence. Eur. Radiol. 31, 3797–3804 (2021).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Beede, E. et al. A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (2020).
Kerr, B. The flipped classroom in engineering education: a survey of the research. Proceedings of the 2015 International Conference on Interactive Collaborative Learning (2015).
The authors thank Danielle Walker, Tim Salcudean, and Peter Zandstra from the Biomedical Imaging and Artificial Intelligence Research Cluster at the University of British Columbia for support and funding.
RH, PP, ZH, RS and MA were responsible for developing the workshop teaching content. RH and PP were responsible for developing the programming examples. KYF, OY, MT and PW were responsible for the logistical organization of the project and the analysis of the workshops. RH, OY, MT, RS were responsible for creating the figures and tables. RH, KYF, PP, ZH, OY, MY, PW, TL, MA, RS were responsible for drafting and editing the document.
Communication Medicine thanks Carolyn McGregor, Fabio Moraes, and Aditya Borakati for their contributions to the review of this work.

Post time: Feb-19-2024