SGIM Forum

Medical Education

Basic Science for the 21st Century—Data and Population Health Management

Dr. Lipschitz is an assistant professor in the Division of General Internal Medicine, University of Arkansas for Medical Sciences. Dr. Clemmons is an associate professor of Medical Humanities and Bioethics and the Assistant Dean for Medical Education, University of Arkansas for Medical Sciences. Dr. Alfanek is a PGY-2 resident in Internal Medicine, Brown University School of Medicine. Dr. Yun is a clinical assistant professor in the Division of General Internal Medicine, University of Michigan Medical School.


As health systems embrace value-based care, future physicians must develop a new set of skills to provide equitable, effective, population-based medicine. Learning how to work with datasets and the “denominator” of patient populations is crucial to achieving most metrics of high-value care. While medical education organizations [1, 2] consistently emphasize the need to incorporate population health into curricula and to use data-driven approaches to population health management [3], there is a gap between learners’ conceptual understanding of population health and their mastery of the practical applications of population or database management. Without an applied understanding of population-based medicine, learners can perceive these issues as peripheral to future practice and divorced from clinical and basic science learning [4, 5].

Data and population health management, the new “basic science for the 21st century,” should be integrated into undergraduate medical education. Physicians need skills in simple manipulation of large datasets in order to risk-stratify patients, evaluate drivers of cost, and define target groups for interventions. From 2019 to 2022, colleagues from the University of Michigan Medical School (UMMS) and the University of Arkansas for Medical Sciences (UAMS) developed and piloted a targeted curriculum for population health and database management. This educational innovation offers a meaningful bridge from population health theory to high-value clinical practice.

Goals and Objectives

Through a cross-institutional collaboration, we developed the Practical Population Health (PPH) Chronic Disease Dataset, a fictitious dataset of 10,000 patients with chronic illness. Coupled with a curriculum that teaches concepts in population health, this dataset was developed to facilitate medical students’ mastery of applied skills in database manipulation and population health management. The PPH curriculum guides medical students from analyzing a large dataset of patient information to identifying a specific cohort of high-cost/high-need patients, evaluating systemic and individual barriers to care, and identifying interventions for more equitable, high-value care. Student learning outcomes were assessed using pre- and post-course surveys and qualitative course evaluations.


The PPH curriculum and PPH Chronic Disease Dataset were piloted at two medical schools from 2019 to 2022 to give students a framework for integrating and applying concepts of data and population health management. At the University of Michigan Medical School, the curriculum was offered to third-year post-clerkship medical students as part of a two-week course titled “Introduction to Patients and Populations.” At the University of Arkansas for Medical Sciences, the curriculum was embedded into a twelve-week longitudinal fourth-year medical student elective titled “Population Health, Health Equity and Care for High-Risk Patients.”

Students were first introduced to core concepts of population health and then provided with the PPH Chronic Disease Dataset, which included demographic data, cost of care, health system utilization, chronic disease diagnoses, and specific socioeconomic metrics. Using Excel, students learned the basic skills of creating pivot tables, sorting data with filters, and segmenting patient groups. Students risk-stratified the population based on different drivers of risk, including cost, hospital utilization, and number of chronic illnesses, to learn how to manipulate data and identify cohorts for intervention. After using the dataset to identify high-risk cohorts, students were given a case-based presentation of high-cost/high-need patients to demonstrate how the process of identifying and sub-segmenting a population could then be used to apply different evidence-based interventions.
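The filter, sort, segment, and pivot steps described above can be sketched in a few lines of code. The example below is only an illustration, written in Python rather than the Excel workflow the students actually used. The field names, the six sample patients, the dollar and admission thresholds, and the "high / rising / low" tier labels are all assumptions made for this sketch; none of them come from the PPH Chronic Disease Dataset itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical fields loosely mirroring the kinds of data described above.
    patient_id: int
    age_group: str           # e.g. "18-44", "45-64", "65+"
    annual_cost: float       # total cost of care in dollars
    admissions: int          # hospital admissions in the past year
    chronic_conditions: int  # number of chronic disease diagnoses


# Tiny made-up sample for illustration only.
PATIENTS = [
    Patient(1, "65+", 82000.0, 4, 5),
    Patient(2, "45-64", 3100.0, 0, 1),
    Patient(3, "65+", 12500.0, 1, 3),
    Patient(4, "18-44", 900.0, 0, 0),
    Patient(5, "45-64", 47000.0, 3, 4),
    Patient(6, "18-44", 26000.0, 2, 1),
]


def risk_tier(p: Patient) -> str:
    """Assign a tier from three drivers of risk (thresholds are illustrative)."""
    flags = sum([
        p.annual_cost >= 25000,
        p.admissions >= 2,
        p.chronic_conditions >= 3,
    ])
    if flags >= 2:
        return "high"
    if flags == 1:
        return "rising"
    return "low"


def pivot_by(patients, row_key):
    """Pivot-table analogue: patient count and mean annual cost per group."""
    groups = defaultdict(list)
    for p in patients:
        groups[row_key(p)].append(p.annual_cost)
    return {k: (len(v), sum(v) / len(v)) for k, v in groups.items()}


# Filter and sort: the high-risk cohort, most expensive patients first.
high_risk = sorted(
    (p for p in PATIENTS if risk_tier(p) == "high"),
    key=lambda p: p.annual_cost,
    reverse=True,
)

if __name__ == "__main__":
    print("High-risk cohort:", [p.patient_id for p in high_risk])
    for tier, (n, mean_cost) in sorted(pivot_by(PATIENTS, risk_tier).items()):
        print(f"{tier:>6}: n={n}, mean cost=${mean_cost:,.0f}")
```

Passing a different key to `pivot_by`, for example `lambda p: p.age_group`, re-segments the same population by another dimension, much as dragging a different field into the rows of a pivot table would.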

Over three years, the curriculum was modified iteratively based on lessons learned from previous courses. The data management component ultimately incorporated the following four parts:

  1. A short didactic lecture entitled “Data for Non-Data Doctors”
  2. An instructional video teaching students basic Excel skills
  3. A large-group, team-based learning activity to evaluate trends in the data
  4. At UAMS, a final project in which students proposed interventions to address a gap identified in the population


Because the data component of our curriculum evolved over the three years, we focused on qualitative student comments to guide our results. Student responses to open-ended feedback questions at both institutions were analyzed using a thematic analysis approach [5]. Themes that emerged included the strong relevance of the material, exposure to new concepts not previously studied in medical school, and the value of hands-on data analysis. Representative student comments include the following:

  • “This was very helpful for understanding data close to home and visualizing real health equity data.”
  • “We get very little training on data interpretation/analysis during med school and learning tricks and tools on excel [sic] that we can take with us going forward will be very beneficial.”
  • “This was a very useful experience in trying to utilize statistical modeling to better understand how to use and visualize a dataset in context of patient and population.”


The PPH Chronic Disease Dataset and complementary curriculum give medical students a comprehensive framework for understanding data and population health management: introducing the role of data for future physicians, teaching skills to manipulate a database, and evaluating evidence-based programs for targeted patient groups. The PPH curriculum emphasizes real-world application of population health concepts, connecting theory with practice. Qualitative feedback indicated that students valued learning about population health when coursework demonstrated a clear connection to clinical patient care. After working with the PPH Chronic Disease Dataset, many students felt that they could “see themselves” incorporating data and population health into their future careers. This feedback highlights the importance of creating real-world, applied opportunities for students to practice manipulating and using population-level data in medical education.

Our curriculum and the analysis have limitations, namely small student numbers, variation between institutions, and lack of rigorous quantitative evaluations due to the iterative nature of the curriculum over three years. However, there are clear opportunities to expand this curricular model. Future plans include integrating chronic disease registries from the electronic medical record for data evaluation as well as developing more simulated “playgrounds” for student learning.

Despite its limitations, the PPH Chronic Disease Dataset stands alone as a curricular innovation with great opportunities for expansion and wide-ranging application in medical education. The dataset simulates real-world population-level data and can be used to teach a variety of skills to students. Beyond our PPH curriculum, a refined and expanded dataset and data playground could support a variety of applications in data analysis in medical education. We see a need for further development of data resources tailored for medical student use to help teach data and population management.


References

  1. Byrne LM, Nasca TJ. Population health and graduate medical education: Updates to the ACGME’s common program requirements. J Grad Med Educ. 2019;11(3):357-361.
  2. Murphy B. New approach equips med school grads for tomorrow’s health system. AMA. Published June 5, 2018. Accessed February 15, 2023.
  3. Gonzalo JD, Davis C, Thompson BM, et al. Unpacking medical students’ mixed engagement in health systems science education. Teach Learn Med. 2020;32(3):250-258.
  4. Gonzalo JD, Ogrinc G. Health systems science: The “broccoli” of undergraduate medical education. Acad Med. 2019;94(10):1425-1432.
  5. Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77-101.

