Raw and processed data

Jeffrey Leek, Assistant Professor of Biostatistics
Johns Hopkins Bloomberg School of Public Health

Definition of data

Data are values of qualitative or quantitative variables, belonging to a set of items.


Set of items: Sometimes called the population; the set of objects you are interested in.

Variables: A measurement or characteristic of an item.

Qualitative: Country of origin, sex, treatment

Quantitative: Height, weight, blood pressure
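A minimal sketch of the definition in Python. The records, field names, and the helper `variable_kind` are hypothetical and chosen only for illustration. Each dictionary is one item, each key is a variable, and the values are the data.

```python
# Hypothetical set of items: each dict is one item (a person).
items = [
    {"country": "Kenya", "treatment": "placebo", "height_cm": 172.0, "weight_kg": 68.5},
    {"country": "Peru", "treatment": "drug", "height_cm": 160.5, "weight_kg": 59.0},
    {"country": "Japan", "treatment": "drug", "height_cm": 181.2, "weight_kg": 77.3},
]


def variable_kind(values):
    """Label a variable quantitative if every value is a number, else qualitative."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


# Collect the values of each variable across all items and classify them.
kinds = {
    name: variable_kind([item[name] for item in items])
    for name in items[0]
}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

This type-based check is only a rough heuristic: a qualitative variable stored as numeric codes (for example, treatment recorded as 0 or 1) would be labeled quantitative, so the distinction ultimately depends on what the variable means, not on how it is stored.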

Raw versus processed data

An example of a processing pipeline
