Vol. VIII, No. 2

August, 1995

Humanities Databases - Separating Facts From Opinions

by H. Eiteljorg, II

The use of databases in the humanities continues to grow. Undoubtedly, more and more sets of data will be gathered for use by scholars, and, thanks to projects like the National Initiative for Scholarship in the Arts and Humanities (NISAH), there is reason to hope that those sets of data will be related to one another in useful ways. (For more information about NISAH, see the May, 1995, issue of the CSA Newsletter, or check out http://www.ahip.getty.edu/ahip/home.html on the Web.) [This web site is no longer active. - ed.]

Another important aspect of database design, however, has been less often considered, and this aspect may ultimately have as much to do with the usefulness of data sets for scholars as the way those sets are related to one another. This aspect of database design might be called the fact-or-opinion question or perhaps the contested-data issue. The problem is inherent in humanities scholarship, because there are differences of opinion about so many matters of supposed fact. It was brought to my attention recently when I reviewed A Guide to the Description of Architectural Drawings, by Vicki Porter and Robin Thornes. This new publication is intended, as the title indicates, to help those who catalog architectural drawings in computer databases. In this generally full and thoughtful scheme for describing drawings there are no provisions for handling contested or differing interpretations, as if all decisions were incontrovertible.

Differences of opinion regarding items in a data file are not only common, they are often significant and may involve very basic items of information. For instance, a drawing, a painting, or a piece of sculpture may be attributed to one draftsman or artist by a certain scholar; to another by a different scholar; to the office, school, or workshop of a named person by a third; and so on. Similarly, the date of a specific item may be designated by one scholar according to a chronological scheme not accepted by another.

Since many attributions, dates, and other supposedly factual data are decided differently by different scholars, the facts inserted into a database might be different if two different scholars were preparing the data. Ought a database, therefore, to include as part of its data the sources and other information required by a user who needs to see the full complexity of the data? For instance, should a museum catalog or an excavation catalog indicate that the date for an object has been given by a specific scholar? Or, on the contrary, should the date appear to be as secure as if it were an immutable fact, ordained by the deity? Ought an attribution to an artist be noted as reflecting the opinion of a specific student of the work? Or should all attributions carry the weight of the artist's signature, duly notarized?

Uncertainty is not the issue here; it is not difficult to indicate that a date is vague or uncertain. Nor is it a problem to indicate that the creator of an object is not known. But it can be difficult to indicate differences of opinion.

SIBYL, a database of classical iconography based on material from the U.S. Center of the Lexicon Iconographicum Mythologiae Classicae (LIMC), is one database project that was well planned to deal with such scholarly disagreements. Attributions (of specific vases to painters, for instance) have been put into the data files, but each attribution carries with it the name of the responsible scholar. Equally important, there may be competing attributions for a single object, each linked to the scholar who proposed it. The careful design of the SIBYL files preserves the full complexity of the scholarship and meets the needs of a wide variety of users.
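
The general idea of attributions that each carry a scholar's name, with several allowed per object, might be sketched in Python as follows. This is only an illustration of the principle, not the actual structure of the SIBYL files; the class names, field names, and the placeholder painters and scholars are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribution:
    """One scholar's opinion about who made an object (hypothetical model)."""
    artist: str    # the proposed maker
    scholar: str   # the scholar responsible for the proposal


@dataclass
class CatalogObject:
    """A cataloged object that may carry any number of competing attributions."""
    object_id: str
    attributions: list[Attribution] = field(default_factory=list)

    def attribute(self, artist: str, scholar: str) -> None:
        # An attribution cannot be recorded without naming its source.
        self.attributions.append(Attribution(artist, scholar))

    def proposed_artists(self) -> set[str]:
        return {a.artist for a in self.attributions}

    def is_contested(self) -> bool:
        # Contested means that scholars have proposed more than one maker.
        return len(self.proposed_artists()) > 1


# Placeholder data: two scholars disagree about the painter of one vase.
vase = CatalogObject("vase-001")
vase.attribute("Painter A", "Scholar X")
vase.attribute("Painter B", "Scholar Y")
```

Because the attribution is a list rather than a single field, a disagreement is stored as data instead of being silently resolved by whoever enters the record.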

Many scholarly databases in the humanities were planned before there were thoughts about connecting them to other data sets; they were created by and for rather limited groups of users. Thus, it often seemed unnecessary to make explicit the distinction between opinions and facts when the group of users was so small. But the problem cannot be ignored as we move to put more and more of these data files onto networks and to connect them to a wide array of related data sets. Data - "facts" - will be retrieved from many sources to augment other data files; so the information in any file that may be accessed should be as well documented and supported as the information in any other file. The weakest link in the chain will be truly crucial, especially when the user cannot separate the links but sees the chain as a seamless whole.

Data sets must be designed so that the differences between facts (e.g., recorded dimensions or weight of an object) and opinions (e.g., the recorded date of an object) are absolutely clear, and that distinction can be clear only if it is built into the organization of the data. In addition, the sources for the opinions should be equally clear and must also be integrated into the design of the database. Designing such a database structure is not difficult, but neither is it trivial. Altering existing data sets to include this distinction, however, is both more difficult and more time-consuming than building it in at the outset. More important, connecting ill-structured data sets to networks so that they may be accessed by the unwary could make them truly harmful.
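
One way such a structure could look is sketched below with Python's standard sqlite3 module. It is a minimal sketch under stated assumptions, not a recommended standard: the table names, column names, and sample values are all hypothetical. Measurements are stored directly on the object record, while every date or attribution goes into a separate table whose rows cannot be saved without a named scholar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (
        object_id  TEXT PRIMARY KEY,
        height_cm  REAL,   -- recorded measurement: treated as fact
        weight_g   REAL    -- recorded measurement: treated as fact
    );
    CREATE TABLE opinion (
        object_id  TEXT NOT NULL REFERENCES object(object_id),
        property   TEXT NOT NULL,   -- e.g. 'date' or 'artist'
        value      TEXT NOT NULL,
        scholar    TEXT NOT NULL,   -- no opinion without a named source
        citation   TEXT             -- optional publication reference
    );
""")

# Placeholder data: one object, two competing dates from two scholars.
conn.execute("INSERT INTO object VALUES (?, ?, ?)", ("obj-1", 32.5, 1410.0))
conn.executemany(
    "INSERT INTO opinion VALUES (?, ?, ?, ?, ?)",
    [
        ("obj-1", "date", "ca. 520 B.C.", "Scholar X", None),
        ("obj-1", "date", "ca. 500 B.C.", "Scholar Y", None),
    ],
)


def opinions(conn, object_id, prop):
    """Return every (value, scholar) pair recorded for one property."""
    rows = conn.execute(
        "SELECT value, scholar FROM opinion "
        "WHERE object_id = ? AND property = ? ORDER BY scholar",
        (object_id, prop),
    )
    return rows.fetchall()


date_opinions = opinions(conn, "obj-1", "date")
```

A query for the date of the object returns both proposals together with their sources, so a user retrieving the record over a network sees that the date is an opinion and whose opinion it is.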

This is not an argument for stopping the work on data gathering or the building of data sets. Rather, it is an argument for thinking more clearly about how data items are tagged to indicate their sources, and it is an argument for making a concerted effort to provide full and complete data rather than selected opinions. This is also meant to be a reminder that even small, restricted data sets may be connected to other data sets almost anywhere; indeed, the better designed data sets certainly will be connected to others, sooner or later. Therefore, any database should be designed to an appropriate standard.

Table of Contents for the August, 1995 issue of the CSA Newsletter (Vol. 8, no. 2)