What is data literacy?

Let's build a shared understanding of data literacy.


OK, I started out trying to define "data literacy". I can't get a perfect definition that I am satisfied with, so I will ramble a bit.

Data literacy is a useful skill for problem-solving. When we collect data with a specific purpose in mind, we can use it to discover underlying relationships that go beyond mere observation. One of the limitations is bias. When we recognize possible bias from the outset, we can address it in both our calculations (analysis) and our conclusions.

Part of it is transforming observations and information into a set of usable data. With this data we can perform exploratory calculations and, in many cases, create a mathematical model that represents the data. The wonderful thing about this process is that we can discover relationships. We may also discover phenomena that we didn't expect. For example, by examining residuals (the differences between the observed values and the values the model predicts) we can discover hidden influences. We can also discover that many relationships are not linear. Once we have a model, we can use it to make predictions, and as time goes on we can modify the model to improve its "fit".
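
To make the point about residuals concrete, here is a minimal sketch in Python. The numbers are made up purely for illustration (y is simply the square of x), and the helper name fit_line is my own, not taken from any library. It fits a straight line by ordinary least squares, then looks at the sign of each residual. A run of positives, then negatives, then positives again is the kind of pattern that hints the relationship is not linear.

```python
# Hypothetical, made-up data for illustration: y is the square of x.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [x * x for x in xs]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)

# Residual = observed value minus the value the line predicts.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# A quick way to see the pattern: mark each residual as above (+) or below (-) the line.
signs = "".join("+" if r > 0 else "-" for r in residuals)
print(signs)  # prints "++------++"
```

The line underestimates at both ends and overestimates in the middle. That curved pattern in the residuals is a signal to try a different model (here, one with a squared term) rather than to trust the straight line for predictions.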

So what does this mean?

Data literacy is the skill of identifying meaning in data. This may require knowledge of mathematics and/or statistics, depending on the type of data, the type of analysis conducted, and the complexity of both.

Data literacy is the ability to accumulate a set of data and put it to good use. It is also the ability to understand data that has been organized by others. This typically involves proficiency in constructing and analyzing graphs and tables, an ability to compress large data sets or comprehend how others have done so, and an ability to see relationships as they exist in the data. A certain amount of mathematical fluency is important for this. I would also say that good data literacy involves an ability to communicate what the tables, charts, and graphs show (through labeling and/or summarization).