In April 2020 I created and delivered a talk for the Committee on Professional Ethics of the American Statistical Association, “Writing a case to teach ethical statistical practice with the ASA Ethical Guidelines”. I used this method for the 7 cases included in my first book, “Ethical Reasoning for a Data-Centered World”, and for the 47 cases included in the second book, “Ethical Practice of Statistics and Data Science” (both published in 2022). This presentation leaned into the purposeful introduction of cases for teaching ethical reasoning and the effective use of ethical guidelines like the ASA’s Ethical Guidelines for Statistical Practice. In fact, I pointed out that the purpose of the presentation was to facilitate the creation of “learning-outcomes centered cases for use in the teaching of ethical statistical practice and the ASA Ethical Guidelines”. While cases are often used to “teach ethics”, I do not believe the typical use of cases achieves this aim. It is also difficult to assess student work on case analyses unless students are also taught how to execute a case analysis, and this is not often done, particularly in ‘teaching responsible conduct of research’.
In this 2020 presentation I also introduced the Statistics and Data Science Pipeline, which describes seven tasks that are recognizable to any practitioner of statistics and data science:
1. Plan/design;
2. Collect/munge/wrangle data;
3. Analysis – literal for statistics & data science, “evaluation” for computing;
4. Interpretation – always for statistics & data science, never for computing;
5. Documentation;
6. Report & communicate;
7. Work on a team.
These tasks offer the instructor (and case developer) multiple advantages over writing cases without this structure. First, by creating cases particular to each of these tasks, the instructor can demonstrate, and have students literally engage with, the idea that ethical decision making happens throughout any statistical project – not just at the start or end of a project. Moreover, statistics practitioners may become involved at one or another of these tasks, so they have an obligation to carry out each task ethically; the ASA Ethical Guidelines offer extensive support for each of the tasks. It is also clear from the Pipeline that “ethical statistical practice” goes far beyond just the analysis of data, and some students may be surprised to learn that it is “ethical statistical practice” to communicate effectively with all stakeholders. Finally, by leveraging the Statistics and Data Science Pipeline, instructors can do two important things: 1) be assured that the ethical statistics training they provide is relevant to the specific tasks involved in statistical practice; and 2) feel confident that this training is relevant for modern scientific practice, which is dependent on statistics and data science.
In the five years since developing the Statistics and Data Science Pipeline, I have updated it to be consistent with the Generic Statistical Business Process Model (GSBPM, UN 2019). The GSBPM, developed by the United Nations, describes and defines the set of business processes needed to produce official statistics. The revised Statistics and Data Science Pipeline has the following tasks:
1. Specification of needs/plan/design/build
2. Collect/process/munge/wrangle etc. data
3. Analysis
4. Interpretation
5. Documentation
6. Reporting/communication
7. Work on a team/Evaluation.
I am working on a new (2025?) book, “Ethical Practice of Statistics and Data Science for Public Service”, which will focus on the application of ethical reasoning for local and national government workers. I have 35 cases derived from government settings and have partitioned the cases along the revised Statistics and Data Science Pipeline, so the new book will have a similar structure to the original edition.