Upon entry into practice, all professionals assume at least a tacit responsibility for the quality and integrity of their own work and that of colleagues. They also take on a responsibility to the larger public for the standards of practice associated with the profession.
Envisioning the Future of Doctoral Education: Preparing Stewards of the Discipline—Carnegie Essays on the Doctorate, edited by Chris Golde and George Walker
Ethical Practice of Statistics and Data Science is intended to describe these responsibilities and prepare people to fully assume them for statistics and data science. This book would be useful for those about to embark on a career, but it is also intended for practitioners and mentors or supervisors of practitioners. This book supports the ethical practice of statistics, data science, and “statistics and data science,” with an emphasis on how to earn the designation of – and recognize – “the ethical practitioner.”
Rationale and Scope
Currently, the ethical use of statistics and data is simply not taught, nor is there any emphasis on these critical concepts in statistics or experimental design courses taught by other disciplines (e.g., biology, psychology, policy, business, and economics).
Ethical Practice of Statistics and Data Science offers an alternative perspective. Organized around the following seven easily recognized tasks, the book prioritizes ethical and professional behavior when using statistics and computing:
- Planning/Designing
- Data Collection/Munging/Wrangling
- Analysis (Perform or Program to Perform)
- Interpretation
- Documenting Your Work
- Reporting Your Results/Communication
- Engaging in Team Science/Teamwork
The book features ethical reasoning and, rather than using examples from the news, it focuses on what people do and the choices they make—whether they realize/recognize/acknowledge them or not—during practice with data. This is true for statisticians, data scientists, people doing AI/machine learning and algorithm building, and those who just use these techniques and technologies in the course of their other work.
Moreover, the book shows how to reason ethically, and practice ethically, with professional practice standards from the Association for Computing Machinery (ACM) and the American Statistical Association (ASA). Both organizations state that their guidelines are intended for all who use their methods, techniques, or approaches, whether or not they are members of the organization or have formal training in the domain(s). The reasoning paradigm can be applied to any practice standard, regulation, or law, but the ASA and ACM guidelines are specifically suited to helping any practitioner at any level to promote a more ethical data-centered world.
Additionally, Ethical Practice of Statistics and Data Science includes two ‘checklists’ (not professional practice standards; not consensus-based): the “data science ethics checklist” and the UK government’s “data ethics framework.” Also featured are 47 vignettes organized by task, so there are 5–9 vignettes per task.
Readers need not have completed an introductory statistics course to benefit from this book. However, readers who are enrolled in, or who have completed, an undergraduate or graduate degree in statistics, biostatistics, data science, or any other data-intensive program really need to understand their responsibilities to practice ethically. They will benefit more from Ethical Practice of Statistics and Data Science because they may have had more experience with data and with the challenges of interpreting and contextualizing conclusions and making arguments about or with data. Managers and others leading data-intensive teams would also benefit greatly.
Readers need not have already read Ethical Reasoning for a Data-Centered World to benefit from this book, but reading both wouldn’t go amiss (particularly for management/leadership students).
Suggestions for integrating this material can be found in “Ten Simple Rules for Integrating Ethics into Statistics and Data Science Instruction, 2E.” The seven tasks and 47 vignettes are authentic, so an instructor who has never engaged in any of the tasks will still be able to participate in discussions about ‘what to do’ and, specifically, what the guidelines support practitioners doing in a variety of situations.
Table of Contents
List of Cases
Introduction
Section 1: Ethical Reasoning KSAs, the ASA Guidelines, ACM Code of Ethics
Chapter 1.1 Ethical Reasoning (ER) Knowledge, Skills, and Abilities (KSAs)
Chapter 1.2 The ASA Ethical Guidelines for Statistical Practice
Chapter 1.3 ACM Code of Ethics
Chapter 1.4 The Data Science Ethics Checklist (DSEC) and Data Ethics Framework (DEFW)
Chapter 1.5 The “Universe of Statistics and Data Science”: Tasks
Chapter 1.6 Exploring Guidance from ACM, ASA, DSEC, & DEFW
Chapter 1.7 Augmenting Your “Prerequisite Knowledge”: Stakeholder Analysis
Section 2: Stewardship of the Profession: Prerequisite Knowledge
Chapter 2.1 Introduction to Section 2
Chapter 2.2 Planning/Designing
Chapter 2.3 Data Collection/Munging/Wrangling
Chapter 2.4 Analysis (Perform or Program to Perform)
Chapter 2.5 Interpretation
Chapter 2.6 Documenting Your Work
Chapter 2.7 Reporting Your Results/Communication
Chapter 2.8 Engaging in Team Science/Teamwork
Chapter 2.9 Summary of Section 2
Section 3: Using All Six Ethical Reasoning KSAs with the GLs/CE in Practice (47 Case Analyses with Discussion)
Chapter 3.1 Introduction to Section 3
Chapter 3.2 Planning/Designing
Chapter 3.3 Data Collection/Munging/Wrangling
Chapter 3.4 Analysis (Perform or Program to Perform)
Chapter 3.5 Interpretation
Chapter 3.6 Documenting Your Work
Chapter 3.7 Reporting
Chapter 3.8 Engaging in Team Science/Working with Others
Chapter 3.9 Summary of Section 3
Chapter 3.10 Book Summary
References