Course Description

Provides an intensive introduction to applied statistics and data analysis. Trains students to become data scientists capable of both applied data analysis and critical evaluation of the next generation of statistical methods. Since both data analysis and methods development require substantial hands-on experience, focuses on hands-on data analysis.

Course objectives

Upon successfully completing this course, students will be able to:

  1. Formulate quantitative models to address scientific questions
  2. Obtain, clean, transform, and process raw data into usable formats
  3. Organize and perform a complete data analysis, from exploration, to analysis, to synthesis, to communication
  4. Apply a range of statistical methods for inference and prediction
  5. Build data science products that can be used by a broad audience


This course is designed for PhD students in the Biostatistics department at Johns Hopkins Bloomberg School of Public Health. It assumes a fair amount of statistical knowledge and moves relatively quickly. I am open to anyone taking the class, but since it is a core requirement for our PhD program, I will not be slowing down or allowing auditors.

Course Information


We use DataCamp for our assignments. Please visit the Advanced Data Science link for the assignments.

John’s email is:

Stephen’s email is:

Typos and corrections

Feel free to submit typos, errors, etc. via the GitHub repository associated with the class:


We would like to thank DataCamp for providing us with a great platform for grading and assignments in this course! Check out their courses and DataCamp for Education to use their platform for your course.