Raw and processed data

Jeffrey Leek, Assistant Professor of Biostatistics
Johns Hopkins Bloomberg School of Public Health

Definition of data

Data are values of qualitative or quantitative variables, belonging to a set of items.

http://en.wikipedia.org/wiki/Data

Set of items: Sometimes called the population; the set of objects you are interested in.

Variables: A measurement or characteristic of an item.

Qualitative: Country of origin, sex, treatment

Quantitative: Height, weight, blood pressure
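
As a rough illustration of the definition, the sketch below models a small set of items, each carrying qualitative and quantitative variables. The class name `Item`, the field names, the helper `variable_kind`, and every value are made up for this example and do not come from any real dataset; classifying a variable by its Python type is a simplifying assumption, not a general rule.

```python
from dataclasses import dataclass, fields


@dataclass
class Item:
    """One item from the set of items (hypothetical study participant)."""
    # Qualitative variables: categories or labels
    country_of_origin: str
    sex: str
    treatment: str
    # Quantitative variables: numeric measurements
    height_cm: float
    weight_kg: float
    systolic_bp_mmhg: float


# The set of items; all values are invented for illustration only.
items = [
    Item("Canada", "F", "placebo", 165.0, 60.5, 118.0),
    Item("Kenya", "M", "drug", 178.0, 72.0, 125.0),
    Item("Japan", "F", "drug", 158.5, 54.2, 110.0),
]


def variable_kind(value):
    """Label a value as quantitative if it is numeric, else qualitative.

    Simplifying assumption: numeric codes for categories (e.g. 0/1 for
    treatment) would be mislabeled by this type-based check.
    """
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return "quantitative" if is_number else "qualitative"


# Classify each variable by inspecting the first item's values.
kinds = {f.name: variable_kind(getattr(items[0], f.name)) for f in fields(Item)}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Here the data are the values stored in each `Item`, the variables are the fields, and the list `items` plays the role of the set of items.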

Raw versus processed data

An example of a processing pipeline
