MARC issues

From MARC must die!

Value Interdependence

One of the main issues of the MARC format is that the interpretation of a data value can depend on values in other fields and subfields. For example, the meaning of subfield $b of field 245 is determined by subfield $a of the same field. If the value of subfield $a ends with an equal sign, the content of subfield $b is a parallel title (such as the title in a different language). Conversely, when the last character of subfield $a is a colon, the value of subfield $b is a subtitle.
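The dependence of 245 $b on 245 $a can be sketched in a few lines of Python. This is only an illustration covering the two cases described above. The function name `classify_245_b` and the returned labels are hypothetical, and any other trailing character is simply reported as unknown.

```python
def classify_245_b(subfield_a: str) -> str:
    """Guess the role of 245 $b from the last character of 245 $a.

    Hypothetical helper: it covers only the two cases named in the text.
    """
    trimmed = subfield_a.rstrip()  # ignore trailing whitespace before the mark
    if trimmed.endswith("="):
        return "parallel title"
    if trimmed.endswith(":"):
        return "subtitle"
    return "unknown"  # assumption: anything else is left undecided
```

Note that the function never looks at $b itself: the value of $b cannot be interpreted without first reading $a, which is the interdependence at issue.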

Another example comes from the MARC format for authority data, where the interpretation of subfield $a of field 550 is specified by subfield $w of the same field. If $w contains the value "g", subfield $a describes a broader subject heading; if $w contains "h", subfield $a is a narrower subject heading; and if $w is blank, subfield $a should be interpreted as a related subject heading. These examples show that, because values influence the meaning of other values, it is generally difficult to determine the correct interpretation of data elements in MARC.
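The 550 case can be expressed as a small lookup, sketched here in Python. The names `RELATION_BY_W` and `relation_550` are hypothetical. The sketch assumes that the relationship code is the first character of $w and that a missing $w is treated the same as a blank one.

```python
from typing import Optional

# Mapping taken from the description above; the labels are illustrative.
RELATION_BY_W = {"g": "broader", "h": "narrower"}


def relation_550(subfield_w: Optional[str]) -> str:
    """Return how 550 $a relates to the heading, based on 550 $w."""
    code = (subfield_w or "").strip()
    if not code:
        return "related"  # blank (or, by assumption, absent) $w
    # Assumption: the first character of $w carries the relationship code.
    return RELATION_BY_W.get(code[0], "unknown")
```

A converter would have to call something like `relation_550` for every 550 field before it could decide which target property the heading in $a belongs to.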

In MARC, data values are mixed in with other text. For example, MARC does not have a field that holds only the ISBN; instead, field 020 contains the ISBN along with other text (e.g., "(pbk.)"). Another frequent case is the combination of data values with punctuation. This can be found in data elements such as the aforementioned subfield $a of field 245, in which the punctuation symbol at the end of the value determines how to read the following subfield. These added characters either qualify the value of the data element in which they appear (in the case of the ISBN, "(pbk.)" says that it is the ISBN of the paperback edition) or modify the interpretation of the value that directly follows (as with subfield $a of field 245). These symbols are therefore essential for the correct interpretation of MARC data. However, they need to be cleaned out when the data are converted to non-MARC formats.
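The cleanup step might look roughly like the following Python sketch. It assumes that the 020 value has the shape "ISBN (qualifier)" and that only the marks " =:/" need to be stripped from the end of a 245 $a value; real data are likely to need more cases. The function names and the regular expression are hypothetical.

```python
import re

# Assumed shape: digits, "X" and hyphens, optionally followed by "(qualifier)".
ISBN_WITH_QUALIFIER = re.compile(r"^\s*([0-9Xx-]+)\s*(?:\((.*?)\))?\s*$")


def split_020_a(value):
    """Split an 020 value into (isbn, qualifier); (None, None) if unparsable."""
    match = ISBN_WITH_QUALIFIER.match(value)
    if match is None:
        return None, None
    isbn = match.group(1).replace("-", "").upper()
    return isbn, match.group(2)  # qualifier is None when no "(...)" is present


def strip_trailing_punctuation(value):
    """Remove trailing spaces and the marks '=', ':' and '/' from a value."""
    return value.rstrip().rstrip(" =:/")
```

For a placeholder value such as "0123456789 (pbk.)" (not a real ISBN), `split_020_a` separates the number from the qualifier "pbk.", so the qualifier can be stored as data of its own instead of being thrown away.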

Readability & Efficiency

MARC is neither easily readable nor efficient in data storage. The standard requires some values to be entered multiple times in different data elements, which creates redundancy. For example, the publication date can be found both in field 008 and in subfield $c of field 260. Duplicated input may result in errors, so that data elements describing the same thing contain contradictory values. Because the same data are scattered across multiple places, updates are difficult: the same change needs to be applied in each place.
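A consistency check between the two places could be sketched in Python as follows. It assumes that the year sits in character positions 07-10 of field 008 and that the first four-digit run in 260 $c is the publication year; both assumptions break on uncertain or multiple dates. All function names are hypothetical.

```python
import re


def year_from_008(field_008):
    """Read the year from positions 07-10 of field 008, or None if not numeric."""
    year = field_008[7:11]
    return year if len(year) == 4 and year.isdigit() else None


def year_from_260c(subfield_c):
    """Take the first four-digit run in 260 $c as the year (an assumption)."""
    match = re.search(r"\d{4}", subfield_c)
    return match.group(0) if match else None


def dates_agree(field_008, subfield_c):
    """Return True or False when both years are found, None otherwise."""
    year_a = year_from_008(field_008)
    year_b = year_from_260c(subfield_c)
    if year_a is None or year_b is None:
        return None
    return year_a == year_b
```

Such a check can only report a contradiction. It cannot tell which of the two values is the correct one, which is the practical cost of entering the same datum twice.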

MARC also suffers from issues connected to using words as identifiers.

Features such as the ones mentioned above make using and developing with the MARC format hard, time-consuming, and therefore costly.
