Comma Separated Values

A format in which values are separated by commas. It is a commonly used, and widely misused, file type.

The CSV format has a number of issues that can trap the unwary. The most obvious is that the term defines a format, not a data form. That is, it tells us how the elements are to be written but does not say what is included. To be usable, the data must be further described by stating what the columns mean, and which columns are optional and which are required.
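As an illustration of describing the data as well as the format, the following is a minimal sketch in Python. It assumes the file starts with a header row, and the column names (`id`, `name`, `email`) are invented for this example rather than taken from any standard.

```python
import csv
import io

# Hypothetical description of the data: these column names are
# illustrative assumptions, not part of the CSV format itself.
REQUIRED = {"id", "name"}
OPTIONAL = {"email"}


def read_rows(text):
    """Parse CSV text and check its header against the description above."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    header = set(reader.fieldnames or [])

    missing = REQUIRED - header
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    unknown = header - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"unexpected columns: {sorted(unknown)}")

    return list(reader)


rows = read_rows("id,name\n1,Ann\n2,Bob\n")
```

The parser accepts any header at all; it is the separate check against `REQUIRED` and `OPTIONAL` that gives the columns a meaning.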

The next issue is how entries that contain commas are to be written. One common approach is to enclose such entries in double-quote characters, so that an entry can include commas (and possibly new-line characters as well). Of course, this just makes it important to be clear about how double-quote characters themselves are to be included.
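One widely used answer, described in RFC 4180 and used by default in Python's `csv` module, is to write a double-quote inside a quoted entry as two double-quotes. The sketch below shows that convention only; other producers may escape quotes differently, so it should not be read as the rule for every CSV file. The sample rows are invented for illustration.

```python
import csv
import io

# Sample data (invented): one entry with a comma, one with quotes,
# and one with an embedded new-line.
rows = [
    ["name", "remark"],
    ["Smith, John", 'He said "hello"'],
    ["Lee", "line one\nline two"],
]

# Write with the default dialect: entries are quoted only when needed,
# and embedded double-quotes are doubled.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
encoded = buffer.getvalue()

# Read the text back; newline="" leaves embedded new-lines untouched.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))
```

Here `encoded` contains the entries `"Smith, John"` and `"He said ""hello"""`, and `decoded` is equal to the original `rows`, so nothing is lost in the round trip.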

The final issue, and one that occurs more commonly than you might think, is the character encoding to use. As long as standard American English is used this is not often an issue, since the majority of modern encodings are some variant of ASCII, and simple characters have the same form in most character sets. However, less common characters, such as accented letters or currency symbols (for anything except the dollar), will typically not translate well from one encoding to another.
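The effect can be seen in a short Python sketch. UTF-8 and Latin-1 are chosen here purely as two example encodings, and the sample text is invented; the same kind of mismatch can arise between other pairs.

```python
import csv
import io

plain = "cafe,$5\n"    # ASCII characters only
accented = "café,€5\n"  # an accented letter and a non-dollar currency symbol

# ASCII-only text is byte-for-byte identical in both encodings.
same_bytes = plain.encode("utf-8") == plain.encode("latin-1")

# Text written as UTF-8 but read as Latin-1 is silently garbled.
utf8_bytes = accented.encode("utf-8")
misread = utf8_bytes.decode("latin-1")

# Stating the encoding explicitly when reading recovers the original text.
stream = io.TextIOWrapper(io.BytesIO(utf8_bytes), encoding="utf-8", newline="")
parsed = list(csv.reader(stream))
```

Note that the misread case raises no error: `misread` begins with `cafÃ©` instead of `café`, which is why the encoding needs to be agreed along with the column descriptions.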

