Articles in Vol. XXIV, No. 2
Project Publication on the Web — III
Andrea Vianello, Intute, and Harrison Eiteljorg, II
In the last two issues of the CSA Newsletter the authors have discussed important questions that scholars must ask before beginning to use a website as a publication medium for a project: impediments to doing so and motivations for doing so. For those scholars who have decided that the impediments are not daunting and have determined their particular motives, it is now time to dive more deeply into the topic. How should the committed scholar proceed?
The first step was mentioned in the last article. That first step is the selection of a Digital Information Officer, also known as the chief geek. It is our firm conviction that the selection of this member of the team is of critical importance because that person will be the one who drives the decision-making process for matters digital. It will be up to that member of the team to ask the right questions about the computing side of the project, propose useful solutions (generally more than one per question), and implement the decisions. As we said then, this person should be brought onto the team at an early stage so that questions that may not appear to require the chief geek's input are not answered too abruptly.
The following are the major questions we see as being the first to be answered after the chief geek has joined the team. Not all of them are relevant to the issue of using the web to publish project results, so not all will be dealt with here. Those marked with asterisks are both important and relevant to the use of the web as a publication medium; the others, though important, are not.
In this article, we will focus on the last eleven of the above questions. Not all of these issues will be germane to all scholars embarking on the construction of a web site as the project publication medium. Some will not wish to use the site for administrative matters at all. Others may choose to omit all preliminary publications from the web site. Nevertheless, we will endeavor to provide some thoughts about all these issues. Note that the issue that may be most contentious -- final reports and analyses -- has been omitted here. It will be the subject of the last article in this series, which appears separately in this issue.
How will data be archived?
Sadly, we cannot today make the claim that archival preservation is either universally available or easily obtained. In fact, it may be impossible to find a suitable repository for the files. As a result, it may be necessary to prepare archival copies of files and store them locally until a suitable repository can be found. In such an eventuality, funds should be retained to pay for archival preservation when it becomes available.
Will access to data be provided to website users in a fine-grained fashion (access to individual data items) or will access only be to complete files?
It is tempting to leave this discussion and simply move on, offering no opinion for today's project directors. However, we believe there is a prudent course: offer only complete files. Only if the project has an unusually large budget for this work and a serious interest, both intellectual and digital, in the processes and procedures of creating fine-grained access systems may the choice to provide fine-grained access as well as the original files be justified. Note that we do not offer a choice of fine-grained access only. We believe firmly that the original files must be preserved for access by any scholar wishing to understand fully the methods as well as the results of a project. If they must be preserved, we see no reason not to make them available. Fine-grained access can be added at any future date, but the full file system of a project cannot be reconstructed from the data extracted for fine-grained access. (Note, however, that we have not spoken to the timing question here.)
The issue of file formats for those files to be shared is important here. In general, the formats supplied should be the most widely used non-proprietary formats available. That does not mean that the original files — in whatever formats — cannot be made available as well. In fact, we would recommend that the original files be made available whenever possible along with equivalent files in non-proprietary formats. When non-proprietary formats are not available (e.g., for CAD files in the DWG format), there is little choice but to make them available in the format the project has. This is one of the reasons an archival repository is a good idea. The people who operate such repositories will be better able to deal with these questions than the best of the chief geeks for individual projects.
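As an illustration of the pairing recommended above, a chief geek might keep a simple check that every original file is accompanied by an equivalent in a non-proprietary format. The following is only a sketch under stated assumptions: the set of "open" extensions is a placeholder that each project would choose for itself, the file names are invented, and equivalence is judged solely by a shared path and base name.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Assumption: a project-chosen list of extensions treated as non-proprietary.
OPEN_EXTENSIONS = {".csv", ".txt", ".png"}


def originals_without_open_copy(filenames: Iterable[str]) -> List[str]:
    """Return files that have no sibling (same path and base name)
    in one of the extensions listed in OPEN_EXTENSIONS."""
    by_stem: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        stem = str(PurePosixPath(name).with_suffix(""))
        by_stem[stem].append(name)

    lacking: List[str] = []
    for names in by_stem.values():
        has_open = any(
            PurePosixPath(n).suffix.lower() in OPEN_EXTENSIONS for n in names
        )
        if not has_open:
            lacking.extend(names)
    return sorted(lacking)


# Hypothetical file list: the spreadsheet has a CSV twin, the CAD file does not.
files = [
    "finds/pottery.xlsx",
    "finds/pottery.csv",
    "plans/trench1.dwg",
    "notes/day01.txt",
]
print(originals_without_open_copy(files))
```

Files reported by such a check are not necessarily a problem; as noted above for CAD files, there may be no non-proprietary equivalent, in which case the original is simply shared as it is.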
Will preliminary data be posted on the web for access by people beyond the project team?
Those inclined to risk the adverse impact of making preliminary files available should make every effort to verify the accuracy of the files made public before releasing them. In many instances, that will be sufficient. But it will not be easy. (It is prudent to check data files in each off-season span anyway. So this is simply another reason for checking data files between work seasons.)
The temptation for those who do not wish to take the risk of making preliminary files available is to delay the day of reckoning, the day when the files are finally deemed ready for sharing. It is difficult, after all, to let the information become public and to risk the inevitable: having your errors pointed out. Thus, the project that decides to delay release of data files should impose a deadline of some sort so as to prevent unending delays.
What kinds of files are expected?
Given the assumption of digital photography by many, how will photographs be treated?
Photographs should have accompanying information. For every photograph, the photographer, the date, and something about the subject(s) should be recorded as the bare minimum. For photographs of work in progress, information about the viewpoint should also be included, and time of day would often be helpful. More information would be desirable, e.g., camera, lens, and various technical bits, but these items may be safely omitted. Imagine, however, having a photograph from a casual visitor to the project in the project storehouse and not knowing whether the project has permission to use it. Similarly, one might imagine a photograph of staff members from a given season without all names or without a date. At some point, presumably rather far in the future, such a photograph will become all but useless. (It should go without saying that photographs should include scales and color charts when appropriate.)
Photographs may present problems because of the number of people on a project who may be taking photographs for their own purposes. Should those photographs be included in the project corpus? We do not think this is a question susceptible to a one-size-fits-all answer. A policy should be in place, and there should be some simple procedure for downloading photographs to the project's master storehouse (with requisite information about photographer, camera, lens, etc., as determined by the project staff), and care should be exercised to be sure the procedures are followed. In addition, nobody should pretend that all photographers will know and use the project's preferred standards. Therefore, there should be a process for accepting photographs that are considered important but that do not meet project standards. (It may be argued that there should also be a blanket permission form providing the project with the right to use any photograph accepted into the storehouse.)
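To make the minimum information for photographs concrete, one possible record structure is sketched below. It is an illustration only: the field names are hypothetical, and treating permission as part of the minimum is an assumption drawn from the concern about visitors' photographs, not a stated rule. Rather than rejecting an incomplete record, the check reports what is missing, so that important photographs falling short of project standards can still be accepted and flagged.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PhotoRecord:
    """Hypothetical record for one photograph in the project storehouse."""
    filename: str
    photographer: Optional[str] = None
    date_taken: Optional[date] = None
    subject: Optional[str] = None
    work_in_progress: bool = False
    viewpoint: Optional[str] = None      # expected for work-in-progress shots
    permission_granted: bool = False     # assumption: tracked per photograph
    camera: Optional[str] = None         # desirable, but safe to omit
    lens: Optional[str] = None           # desirable, but safe to omit


def missing_fields(photo: PhotoRecord) -> List[str]:
    """List the minimum items that are absent from a record."""
    missing: List[str] = []
    if not photo.photographer:
        missing.append("photographer")
    if photo.date_taken is None:
        missing.append("date_taken")
    if not photo.subject:
        missing.append("subject")
    if photo.work_in_progress and not photo.viewpoint:
        missing.append("viewpoint")
    if not photo.permission_granted:
        missing.append("permission")
    return missing


# Invented examples: a complete staff photograph and a visitor's snapshot.
complete = PhotoRecord(
    filename="season3_0412.jpg",
    photographer="Staff photographer",
    date_taken=date(2011, 7, 1),
    subject="Trench 1, north section",
    work_in_progress=True,
    viewpoint="from the south",
    permission_granted=True,
)
visitor = PhotoRecord(filename="visitor_007.jpg", subject="Staff group")

print(missing_fields(complete))
print(missing_fields(visitor))
```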
What forms of indexing and/or searching will be developed?
Will there be a private section of the website for team members?
Assuming use of the website for various functions (e.g., volunteer information) during the life of the project, will all organizational web pages be archived, some portion, none — with all revisions, last revision only, major revisions only, . . . ?
Will preliminary discussions and analyses, comparable to field-season reports, be posted on the web for access by people beyond the project team?
One approach is to add notes to the original documents (perhaps color coded?) so that new readers of old documents will not be unnecessarily confused and so that people re-reading the older documents can see clearly where and how changes have been brought into the discussion. This is time-consuming and organizationally complex. Therefore, an alternative might be to include, in the introduction to any new document, notations of changes from previously published works, leaving the original works untouched. That risks allowing the originals to seem still to be authoritative, however. Ideally, there should be some notation in the originals to point out changes, even if that notation is only a link to the updated document and some standard note (e.g., "New information has obliged us to revise statements made here. See . . . ."). The authors believe that some form of notation in old and outdated documents is required; at the bare minimum, the site needs an index more complex than the norm, one that records which documents have been updated and which are themselves updates of other documents.
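The kind of index described here, one that knows which documents update which, might be sketched as follows. This is a minimal illustration, not a prescribed design: the class, its field names, and the document titles are all hypothetical. Each entry lists the earlier documents it revises, the reverse relationship is derived from those lists, and a standard note can then be generated for any outdated document without altering its text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IndexEntry:
    """Hypothetical index entry for one published document."""
    doc_id: str
    title: str
    # Identifiers of earlier documents that this document revises.
    updates: List[str] = field(default_factory=list)


def updated_by(entries: List[IndexEntry]) -> Dict[str, List[str]]:
    """Map each older document to the documents that revise it."""
    reverse: Dict[str, List[str]] = {}
    for entry in entries:
        for older_id in entry.updates:
            reverse.setdefault(older_id, []).append(entry.doc_id)
    return reverse


def revision_note(doc_id: str, entries: List[IndexEntry]) -> Optional[str]:
    """Return a standard note for an outdated document, or None if current."""
    titles = {e.doc_id: e.title for e in entries}
    newer = updated_by(entries).get(doc_id)
    if not newer:
        return None
    listed = "; ".join(titles[n] for n in newer)
    return (
        "New information has obliged us to revise statements made here. "
        "See: " + listed
    )


# Invented example: a later season report revises an earlier one.
entries = [
    IndexEntry("r2009", "Season report 2009"),
    IndexEntry("r2010", "Season report 2010", updates=["r2009"]),
]
print(revision_note("r2009", entries))
print(revision_note("r2010", entries))
```

Recording the relationship only in the newer entry keeps the older record untouched, which suits an archive in which published documents are not meant to change.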
It may be argued that this kind of revision is so familiar to scholars that it is unnecessary to change practices with a movement to the web as the publication medium. Notations of change are certainly not required in the print world. On the other hand, most authors, when proposing any important change of interpretation, will sketch out the background in the course of discussion, complete with bibliography. We are simply arguing here that some such awareness of the potential for confusion be built into the system.
Will web pages aimed at the general public be a part of the site from the beginning or only when analysis has been completed?
These are all interesting questions that cannot be given a simple answer. The portion of the site devoted to the general public should not, in our view, be so much a section of the website as individual pages that are aimed at the public — and marked as such in menus or indices — and that lead both to other pages intended for general audiences and to the scholarly portions of the site. The general public should, when so inclined, be able to dig as deeply into the material as any scholar, but the starting points will be different.
The more complex the project, the more complex these pages will need to be. At the least, we believe that Peter Young (see reference above) had it right; the general public should be able to determine the aims, methods, and rationales for the project. We must be able to explain what we do, how we do it, and why we do it to an audience broader than our colleagues. At the end of the day, the support of the wider public is required.
As to the question of waiting until the end of the project to put up web pages for the general public, we think that is a mistake. The more important and far-reaching the project, the more desirable it will be to speak to the public from the beginning. That, however, may raise some difficult archiving questions. If the web pages intended for the public are prepared early in the history of the project, they will surely change, and probably change often. Should those various versions be preserved? It seems to us that they should. They may well represent a very interesting and useful version of the history of the project. Therefore, we favor preserving all versions of these documents, except that where successive versions differ only trivially, the last of them should be the one preserved. (Triviality is easy to say and harder to define, but basically non-trivial changes are those that affect meaning.) Once the project and the site are complete, there should be some narrative to explain the evolution of these documents. That would aid both scholarly and general readers who wish to understand the project as a whole.
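The preservation rule for public pages can be illustrated with a short sketch. It rests on a loud assumption: "trivial" is stood in for here by a placeholder test that ignores only whitespace and capitalization, whereas a real project would need its own judgment of what affects meaning. Versions are taken in chronological order, and a version that differs only trivially from the one before it replaces that version rather than being stored alongside it.

```python
from typing import Callable, List


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case (placeholder notion of 'same')."""
    return " ".join(text.split()).lower()


def is_trivial_change(old: str, new: str) -> bool:
    """Assumption: a change is trivial if only spacing or case differs."""
    return normalize(old) == normalize(new)


def versions_to_preserve(
    versions: List[str],
    trivial: Callable[[str, str], bool] = is_trivial_change,
) -> List[str]:
    """Keep every version, except that a run of trivially different
    versions is represented by its last member."""
    kept: List[str] = []
    for version in versions:
        if kept and trivial(kept[-1], version):
            kept[-1] = version      # later trivial variant supersedes earlier
        else:
            kept.append(version)    # meaning changed, so preserve separately
    return kept


# Invented page history, oldest first.
history = [
    "Aims of the dig.",
    "Aims of the dig. ",
    "Aims of  the dig.",
    "Aims and methods of the dig.",
    "Aims and methods of the dig.\n",
]
print(versions_to_preserve(history))
```

Because each version is compared only with the one most recently kept, a long series of individually trivial edits could in principle drift in meaning without triggering a new preserved version; a stricter policy might compare against the first version of each run instead.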
It would be foolish to pretend that the foregoing discussion has been more than an outline of the issues to be confronted. The individual problems discussed, however, should provide a handy outline for the chief geek and others as they approach the work of publishing a project on the web.
-- Andrea Vianello and Harrison Eiteljorg, II
For additional articles in this series, please see "Project Publication on the Web — IV," in this issue at csanet.org/newsletter/fall11/nlf1103.html; "Project Publication on the Web — Addendum," by Eiteljorg at csanet.org/newsletter/winter12/nlw1205.html; and "Project Publication on the Web — Addendum II," by Vianello at csanet.org/newsletter/spring12/nls1204.html.
All articles in the CSA Newsletter are reviewed by the staff. All are published with no intention of future change(s) and are maintained at the CSA website. Changes (other than corrections of typos or similar errors) will rarely be made after publication. If any such change is made, it will be made so as to permit both the original text and the change to be determined.
Comments concerning articles are welcome, and comments, questions, concerns, and author responses will be published in separate commentary pages, as noted on the Newsletter home page.