Data Quality

A measure of how fit a set of data is for its intended use. An ambiguous term that refers to a number of different measures, usually split into:

The first of these measures is the most difficult to estimate. There are a number of techniques for measuring and correcting data quality issues. Usually the more automated approaches are the most effective.

Six Sigma-based methods have provided a number of tools for measuring data quality.
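One metric commonly used in Six Sigma work is defects per million opportunities (DPMO). The sketch below applies it to data records, assuming each record is a unit and each field is one opportunity for a defect. The sample records, the field names, and the is_defect rules are hypothetical illustrations, not part of any standard.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    total = units * opportunities_per_unit
    if total <= 0:
        raise ValueError("need at least one opportunity")
    return defects / total * 1_000_000


# Hypothetical sample data: 4 records (units), 3 fields each (opportunities).
records = [
    {"name": "Ada", "email": "ada@example.com", "age": 36},
    {"name": "", "email": "bob@example.com", "age": 41},
    {"name": "Cy", "email": None, "age": -5},
    {"name": "Di", "email": "di@example.com", "age": 29},
]


def is_defect(field: str, value) -> bool:
    """Assumed rules: missing values and implausible ages count as defects."""
    if value is None or value == "":
        return True
    if field == "age" and not (0 <= value <= 130):
        return True
    return False


# Count every defective field across all records.
defects = sum(is_defect(f, v) for r in records for f, v in r.items())
rate = dpmo(defects, len(records), 3)
print(f"{defects} defects, DPMO = {rate:.0f}")
```

What counts as a defect is a judgment that depends on the intended use of the data, so the rules would normally be agreed with the data's users before measuring.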

Improving data quality costs money and effort, but poor-quality data imposes a "data tax" on the user. There is a balance point at which the cost of improvement equals the cost of the data tax.
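The balance point can be illustrated numerically. The sketch below assumes two made-up cost curves: an improvement cost that rises steeply as quality approaches 100%, and a data tax that falls as quality rises. Both functions and all constants are hypothetical; in practice the curves would have to be estimated for the data set in question.

```python
def improvement_cost(q: float) -> float:
    """Hypothetical cost of reaching quality level q (fraction of correct data)."""
    return 1000.0 * q / (1.0 - q)


def data_tax(q: float) -> float:
    """Hypothetical cost imposed on users by the remaining bad data."""
    return 50000.0 * (1.0 - q)


def balance_point(lo: float = 0.0, hi: float = 0.999, steps: int = 60) -> float:
    """Bisection on the difference between the two costs.

    Works because, for these assumed curves, the difference
    improvement_cost(q) - data_tax(q) increases steadily with q.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if improvement_cost(mid) < data_tax(mid):
            lo = mid  # improvement is still cheaper than the tax: go higher
        else:
            hi = mid
    return (lo + hi) / 2.0


q = balance_point()
print(f"balance at about {q:.1%} correct data")
```

With these assumed curves the two costs meet somewhat below 90% correct data, well short of 100%.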

For many purposes, having a realistic estimate of data quality is more important than having 100% correct data.

