Mapping DSEd Lit in HDSR

Wilkerson, M. H. (2025). Mapping the Conceptual Foundation(s) of “Data Science Education”. Harvard Data Science Review, 7(3). doi:10.1162/99608f92.9ac68105

The emerging field of data science education is broadly interdisciplinary, and related literature is distributed across a variety of outlets. Navigating this landscape can be challenging, especially for those who are just entering the field or who seek insights from multiple disciplinary communities. This report contributes to a growing body of research that aims to characterize the emerging field of ‘data science education.’ It presents a reference co-citation analysis drawing from 7,000-plus works that are cited by papers that explicitly identify themselves in their title, keywords, or abstract as concerned with ‘data science education’—thus representing the de facto foundations of this field.

The structure of the resulting reference co-citation network suggests that while there are some ‘broker’ references that are more broadly cited across the emerging data science education literature, the field is building upon three distinct, conceptually coherent, and rather isolated clusters of literature. These clusters generally focus on undergraduate data science programs, K–12 data science, and computational approaches to data for nonmajors, respectively. I characterize each cluster with attention to the audiences, themes, pedagogies, and methodologies emphasized, and explore the nature of recent data science education papers that draw heavily from each cluster. All three clusters demonstrate attention to student-centered pedagogies, research methodologies that highlight student experience, and ethics and diversity. However, they also evidence important differences in scholarly foundations between undergraduate and K–12 data science education efforts, between data science education efforts for majors versus nonmajors, and between K–12 data science initiatives emerging from different groups and disciplines.
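
As a rough illustration of the reference co-citation idea described above, the sketch below counts how often two references appear together in the reference list of the same citing paper. It is a minimal sketch on toy data, not the pipeline used in the article: the function name `cocitation_counts`, the variable `papers`, and the reference labels are all hypothetical, and the article's actual data sources, weighting, and clustering steps are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited by the same paper.

    reference_lists: iterable of lists, one list of reference IDs per citing paper.
    Returns a Counter keyed by alphabetically ordered (ref_a, ref_b) tuples.
    """
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each inner list is one citing paper's reference list.
papers = [
    ["A", "B", "C"],
    ["A", "B"],
    ["C", "D"],
]

counts = cocitation_counts(papers)

# Each pair and its count can be read as a weighted edge in a co-citation network.
for (ref_a, ref_b), weight in sorted(counts.items()):
    print(ref_a, ref_b, weight)
```

In a network built this way, pairs with high counts sit close together, and groups of references that are mostly co-cited with one another but rarely with outsiders would correspond to the kind of isolated clusters the abstract describes.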

