A hierarchical scheme of classification. Usually, a valid taxonomy must be complete (every item is classified) and unambiguous (no item has more than one classification).

Valid and Invalid Taxonomies

In the figure above, the classification scheme on the right is invalid because it is:

  • incomplete: Germany has no regions and Australia is not within a continent
  • ambiguous: Alaska is both part of the USA and a region in its own right, and Ireland is within both the UK and Eire

The standard validity rules for taxonomies can be thought of as defining a diagram like the one above. A valid taxonomy leaves no gaps in the diagram, and every item occupies a single rectangle.
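
The two validity rules can also be checked mechanically. The following is a minimal sketch, assuming the classification is held as a list of (item, parent) pairs. The function name check_taxonomy, the "World" root and the sample data are hypothetical, loosely based on the figure. It covers only the "item has no parent" form of incompleteness, not the "Germany has no regions" form.

```python
from collections import defaultdict

def check_taxonomy(items, assignments, root):
    """Return (unclassified, ambiguous) items for a set of (item, parent) pairs."""
    parents = defaultdict(set)
    for item, parent in assignments:
        parents[item].add(parent)
    # Incomplete: an item other than the root that has no parent at all.
    unclassified = sorted(i for i in items if i != root and not parents[i])
    # Ambiguous: an item that has been given more than one parent.
    ambiguous = sorted(i for i in items if len(parents[i]) > 1)
    return unclassified, ambiguous

# Hypothetical sample data; "World" is an assumed root, not part of the original figure.
items = ["World", "Europe", "UK", "Eire", "Ireland", "Australia"]
assignments = [
    ("Europe", "World"),
    ("UK", "Europe"),
    ("Eire", "Europe"),
    ("Ireland", "UK"),
    ("Ireland", "Eire"),
]

unclassified, ambiguous = check_taxonomy(items, assignments, root="World")
print(unclassified)  # ['Australia']
print(ambiguous)     # ['Ireland']
```

A scheme passes only when both lists come back empty.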

If the set of classifications does not follow the above rules it can still be thought of as an ontology.

Number of levels

The most common type of taxonomy has only a single level: it is simply a list of the valid values that an attribute can have. In this case there are no issues with the mapping between values, since the only relationship between values is mutual exclusivity.
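
In code, a single-level taxonomy often appears as an enumeration. The sketch below is illustrative only: the Continent attribute, its members and the classify helper are assumptions chosen to match the location theme, not a scheme taken from this page.

```python
from enum import Enum

class Continent(Enum):
    # Hypothetical list of valid values for a "continent" attribute.
    AFRICA = "Africa"
    ASIA = "Asia"
    EUROPE = "Europe"
    OCEANIA = "Oceania"

def classify(value: str) -> Continent:
    """Map a raw value onto the taxonomy; values outside the list raise ValueError."""
    return Continent(value)

print(classify("Europe"))  # Continent.EUROPE
```

Because each value maps to exactly one member, mutual exclusivity comes for free. Completeness still depends on the list covering every value that can occur.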

A Location Taxonomy

When there are multiple levels, the relationship between levels can be thought of as being either "part of" or "type of". For example, in the countries figure above, "England" is part of the "UK", which in turn is part of "Europe". The physical nature of the classification makes it easier to ensure that all the applicable items are included.
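
A "part of" hierarchy can be sketched as a mapping from each item to its single parent, which makes ambiguity impossible to express by construction. This is only an illustration: the PART_OF dictionary and the ancestors function are hypothetical names, and the data is limited to the England example.

```python
# Each item maps to the one thing it is part of.
PART_OF = {"England": "UK", "UK": "Europe"}

def ancestors(item, part_of):
    """Walk the "part of" chain upwards and return every enclosing level."""
    chain = []
    seen = {item}
    while item in part_of:
        item = part_of[item]
        if item in seen:
            # A loop would mean the data is not a hierarchy at all.
            raise ValueError(f"cycle detected at {item!r}")
        seen.add(item)
        chain.append(item)
    return chain

print(ancestors("England", PART_OF))  # ['UK', 'Europe']
```

The same structure works for a "type of" hierarchy; only the meaning of the link changes.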

Other taxonomies are based on the "type of" relationship: for example, "Tiger", "Lion" and "Panther" are all types of "Cat", which in turn is a type of "Mammal". In this case there is a widely understood language that defines concepts such as "class", "family", "genus" and "species" and gives names to particular examples. Where a widely understood classification scheme already exists it should be used, unless there is a compelling (and documented) reason to avoid it. Often it is necessary to adjust such schemes slightly; for example, when classifying animals it may be better to adopt their common names rather than the more strictly accurate Latin ones. In such cases one should be extremely careful to document the modifications made to the standard scheme.

Defining a taxonomy

The process of defining and maintaining a complete taxonomy is usually time-consuming and expensive.

The most formal approach is to engage specialists who will gather input from all the stakeholders, analyse it to identify the best scheme, and then publish detailed instructions for using and maintaining the scheme. This requires investment at the start.

The most informal approach is to let users submit their own tags, producing what is known as a folksonomy. This will usually result in a set of categories that are inconsistent, incomplete and ambiguous. The costs here are incurred later in the project (and are often many orders of magnitude greater than the cost of engaging a team at the start would have been).

Most projects fall somewhere between these two extremes.
