Description
Data is overrated. An analysis of the promise of data science to drive better decisions.
Data science drives better decisions. That is the promise of many data science textbooks. Other accounts see data science as the abandonment of hypotheses, models and causality, with statistical correlation replacing all of them.
Decisions, however, do not come only at the end of the data science process; they run through all of it, from the initial questions, to deciding what counts as data, to technical issues of data modeling and analysis. These decisions depend on purpose, question, and context.
I argue that data, as used in databases and data models, should themselves be considered as models and outcomes of modeling processes. They are created and structured artifacts that were generated under specific conditions and can provide precise answers under specific conditions.
Data create their own universe with their own rules of meaning and validity. “Data is overrated” is directed both against the notion of data as a royal road to better decisions and against the notion of bias as something that distorts data after the fact and could be eliminated by reducing decisions.
PDF, English, 100 pages