Vol. XXI, No. 2
September, 2008

Web Site Review: The International Dunhuang Project

Susan C. Jones
(See email contacts page for the author's email address.)




The International Dunhuang Project (IDP)


For centuries, the Silk Road was the major overland conduit between the Mediterranean world and that of eastern Asia. It was not a single pathway, but a series of interconnected, overland trade routes that crisscrossed the daunting terrain of the deserts and mountains of central Asia. The area around the oasis town of Dunhuang (Tun-Huang) on the edge of the Gobi Desert was the eastern terminus for several of the individual routes. Positioned at this strategic location was a complex of caves that provided facilities for the local Buddhists and travelers along the Silk Road. These caves, called the Dunhuang or Mogao Caves, are a cluster of 700 grottoes containing cells and sanctuaries dating primarily from 386-1368 c.e. (the Northern Wei Dynasty through the Yuan Dynasty) with the earliest dated to 366 c.e. Four hundred ninety-two of these contain murals and sculpture. The complex also was the site of a major repository for Buddhist manuscripts. When the site was investigated by Sir Aurel Stein in the early 20th century, this library contained over 40,000 examples of Buddhist scriptures and art that encompass approximately a millennium of Buddhist history. Worldwide "redistribution" of the artifacts to numerous museums and private collections began almost immediately and continued for most of the century. UNESCO declared the Mogao Caves a World Heritage site in 1987.2

Figure 1 - Map of the eastern Silk Road, with Dunhuang circled.3

The International Dunhuang Project (IDP) is an ambitious effort to conserve and provide access to artifacts and archaeological material associated with the eastern Silk Road. The IDP is the result of a 1993 international conference on the problems of preservation of and access to the cultural treasures of the Silk Road in general and the Dunhuang library in particular. Note that the IDP has a greater scope than just the Dunhuang area and the library; it covers the entire eastern Silk Road. (The geographical area is as poorly defined as the Silk Road itself. Figure 1 covers slightly more territory than the verbal description of the eastern Silk Road that accompanied it. For convenience I think of the eastern Silk Road as that portion which ran through what was predominantly Buddhist territory in the first half of the first millennium c.e., roughly Samarkand to Dunhuang.)

The web site under review is the public face of the project. The site provides educational and research resources for students of all ages and levels of scholarship and access to a database of the artifacts and sites along the eastern Silk Route.

Organization and Navigation of the site

The IDP site is a large, complex site that takes full advantage of the non-linear approaches to information possible on the web. In addition to the implicit complexity of the content, all material is, or will be, available in each of the 5 languages of the member institutions: English, Chinese, Russian, Japanese and German. I shall be reviewing only the English pages.4

Organization of the site is straightforward, with each major section accessible through the menu always visible on the left of the screen; each major section has a submenu that appears as the cursor glides over the section name in the left-hand menu. Access to the major sections is also available from a menu across the bottom of every page, and a Google search of the site is available on each page.

Figure 2 - The IDP home page with menus along the left side and as a footer for the page. The Google search window is in the right margin.

I will use the following terminology to distinguish the multiple organizational levels of this complex site; from most general to most specific they are section, subsection, division and subdivision. The four major sections are (1) an introduction, (2) access to the database, (3) "My IDP," and (4) "Support IDP."

Each page is dated in the header with both the date of the original mounting and the last update. The header also shows the simplest hierarchical path to that page from the homepage. I found this helpful both to keep me oriented and to find a given page again. Since cross-links among pages take complete advantage of the relational aspect of research on the web, the exact path by which any given page was reached may be impossible to duplicate. In other words, the overall structure of linked pages is far from hierarchical and you need not reproduce last week's thought processes to find the same pages today.

I do have a minor quibble with the bibliographies. I find it peculiar that this cutting-edge internet resource chooses to separate internet resources so completely from printed ones. They are not on the same page; it is only through the ever-present menus that you can get from one list of resources to another.

When I first tried the bibliographic search available through the database in mid-June, it did not seem to function well. I tried to find references containing the words "sutra," "wood" and "thangka" in any field; nothing was found. The Show All bibliographies feature was also not working on this date. It returned the message, "Active4d error - reference to an undefined variable or value." When I tried both features again in mid-August, they worked. This experience both underscores the problems of reviewing a work in progress and emphasizes the fact that the work is ongoing.

I find it refreshing that the authors understand the frustrations that can accompany getting acquainted with a complex site. They try to soften it with occasional humor. For example here is one of the FAQs with its answer -- "Q29: I would like a cup of coffee/glass of vodka. A: Sorry, not available on IDP. Find a local café/bar with wireless internet and carry on browsing. Lots more to look at."

The International Dunhuang Project Database

Figure 3 - The full database structure as documented on IDP site.
Figure 4 - A detail of upper right corner of full database structure.

This is a relational database originally patterned after DBMSs used in manuscript catalogs. It has been expanded and redesigned to include paintings, textiles, historical photographs, objects and archaeological sites. The DBMS is 4th Dimension, software with a large user community and a long enough history to assure its continued availability and support. The database itself conforms to TEI standards (http://www.tei-c.org/index.xml), Dublin Core criteria (http://dublincore.org/) and other international metadata standards used in specific areas. Our readers are probably familiar with the practical issues raised by using the Dublin Core (H. Eiteljorg, II, "Where Is the Information?" CSA Newsletter X.2, Fall 1997: http://csanet.org/newsletter/fall97/nlf9707.html), but the fact that the IDP uses existing standards is laudable.

It is difficult to assess the search mechanisms of a database if it is not yet complete. Many of the observations given below may only be temporary problems that have been deemed acceptable at this time to get the public interface to the database running. (See my experience with the bibliographic search feature above.) The layout has no obvious flaws. Overall I found the quantity and quality of information presented to be formidable and better than expected for most online projects of this scope.

I want to emphasize that the IDP is, or will become, incredibly rich in primary source material before I discuss problems and shortcomings that I ran across in the database. The images are of exceptional quality. Data for most items were not yet completely entered. Those items that I encountered whose data appeared complete provided a wealth of information about the objects and a translation of any text. (See the entry for British Museum Or.8210/S.328 Recto, a scroll containing an historical romance dealing with a Zhou dynasty minister, Wu Zixu.) If I had the ability to read the Chinese characters, the image would be clear enough to allow me to do my own translating. A complete image of its verso is also presented, even though there was nothing visible on it. The text accompanying the image includes a complete English translation of the text, a catalog entry with brief textual criticism and a bibliography. The layout of the database files and their fields shows that information, such as the date of the scroll, that was missing from the current screen display will be part of the actual database files. That said, I will now discuss some of my frustrations with the IDP database.

The Search Tips page outlines the permissible terms in each form of search. It also states repeatedly that many of the indexed field terms are intended as "rough guides" and that some database records have not yet been indexed. It recommends that Free Text search be used for specific values if nothing is found using the Advanced Search predefined/indexed values. It also states that indexing is currently incomplete. Not all records are indexed and the pull-down lists do not include all the values that can be found.5 What follows are my experiences with specific searches.

A search on "sutra" using the direct search selection from the menu on the left side of the screen produced 186 items. This search is defined as a "contains" search on the entered text occurring in the pressmark and indexed fields. Each object (or, using IDP terminology, "item") in the resulting list appears with a thumbnail image, data on its current location, original findspot, its language, and its medium. There were further links to bring up additional details of each sutra fragment or its findspot. For these items, details include copyright information for the image, format of the item (scroll, pothi, sutra-wrapper, etc.), a transcription, a translation, bibliography, data on the findspot and a "thank you" with personal dedication for any special funding used to process the object. This display does not include all the data suggested by the fields seen in the database layout, only a minimal set that reflects the importance given by the project leaders to various categories of data.

The link to the findspot of any "sutra" gives a site plan, latitude and longitude, a brief description, photographs, dates of the site, a site report, a bibliography and the option to search for other finds from the site. Only the "top 300 finds" for any site are displayed, and I could not discover how to access other finds. Equally frustrating, the definition of a "top find" is not part of the display of the results of the query. Nor could I find a definition anywhere else on the site. The top of the "results" page shows a count of the items found followed by a listing of no more than 300 of them. The order of this listing is also not specified, nor can it be changed.6 In a search for references to a specific sutra or grotto's artifacts, this limit of 300 may make some sense, but when searching for all the artifacts at a given archaeological site, the limit is frustrating, and the inability to access anything other than the top 300 items inhibits research using the database. The necessity to divide a request into individual searches that each provide fewer than 301 objects seems a step backwards and can lead to overlooked and duplicate artifacts. It also puts the onus of developing a sophisticated search procedure on the researcher. The only advice given among the FAQ (Q13) is to further limit the search. With only the indexed fields' circumscribed lists of "rough guides" for searches on these fields possible, the more precise search must include either other indexed fields or the free text search.
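The bookkeeping that this limit pushes onto the researcher can be sketched in a few lines of Python. The sketch assumes that each narrower search yields the count reported by the site and the list of pressmarks actually displayed; the function name, the labels and the sample pressmarks are all hypothetical, and nothing here queries the IDP itself. It merges the partial lists, records duplicates, and flags any sub-search that still exceeded the 300-item display limit.

```python
DISPLAY_LIMIT = 300  # the review reports that at most 300 items are listed per search


def merge_partial_searches(partials):
    """Combine the results of several narrower searches.

    `partials` maps a label for each sub-search to a pair
    (reported_count, displayed_pressmarks). The labels and data are
    supplied by the researcher; they are not an IDP data format.
    """
    merged = []       # unique pressmarks, in first-seen order
    seen = set()
    duplicates = []   # pressmarks returned by more than one sub-search
    truncated = []    # sub-searches whose listing was cut off by the limit

    for label, (reported_count, pressmarks) in partials.items():
        if reported_count > DISPLAY_LIMIT:
            truncated.append(label)
        for mark in pressmarks:
            if mark in seen:
                if mark not in duplicates:
                    duplicates.append(mark)
            else:
                seen.add(mark)
                merged.append(mark)

    return merged, duplicates, truncated


# Hypothetical sub-searches with invented pressmarks.
partials = {
    "form=scroll": (3, ["EX.1", "EX.2", "EX.3"]),
    "form=fragment": (2, ["EX.3", "EX.4"]),
    "form=textile": (522, ["EX.5"]),
}
merged, duplicates, truncated = merge_partial_searches(partials)
```

Any label that ends up in `truncated` marks a sub-search that must be narrowed further before the merged list can be trusted to be complete.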

Not all data on the screen display have been entered for all items yet; I have no knowledge of data present on the database layout but not displayed on the screen. It would be nice to have a date recorded for most recent update for information displayed on the screen. This information is recorded in the database but not presented to researchers. The date of entry may not be thought of as useful during the development phase when most data has yet to be entered, but when the database is "complete" a date could indicate the necessity for an extramural search for more recent bibliography. It would also be useful whenever an alternate interpretation has been submitted and accepted.

Figure 5 - Screen seen when asking for details on a sutra-wrapper (British Museum: MAS.858).
Figure 6 - Screen seen when asking for details on a sutra scroll (British Museum: Or.8210/s.1 Recto).

As you can see from the above screens, neither artifact is dated, a potentially major oversight for any research involving textual criticism or more generalized historical/archaeological research. There are fields for dating sites, artifacts and manuscripts in the database layout; no dates pertaining to the items appear on screen forms. It is possible that these objects have not yet been, or cannot be, dated, but there is no indication on the form that a date will be added. (There are labelled areas on the screen form for the images, translation and bibliography.) Several entries did include a date in other fields or as part of the image. One instance showed an image of the sculpture in its museum display that gave a date (British Museum MAS.1069, a sculpture fragment with a camel's head and neck). Here the date was that given the artifact by museum personnel, not IDP personnel. Another item contained a date in the descriptive catalog field; this was the only time that I saw a date displayed that had been assigned by the IDP staff. Usually there was no date displayed on the screen. Production dates are also not displayed for manuscripts although they are as important for documents and manuscripts as for artifacts, and the database layout includes production date fields.

Editing of the records seems lax. For example, the search for "bowl" under Form produced a wooden bowl fragment, British Museum Or.8212/105, with its material listed as "ink on paper." The "item overview" detail correctly provided its material as wood with a Uighur inscription. Similarly, a request for "figurines" under Form produced 68 figurines and a woven wool textile fragment labeled "British Museum: MAS.560, Site: (Yo.0025.b,) language(s)/Script(s): , Materials: clay." When such obvious mistakes are visible, it raises the likelihood that similar but more subtle errors exist. The IDP is obviously relying on its users to alert them to these problems. They ask for feedback on the problems and entries, and advertise (FAQ Q16) that they are using a modified Wiki approach to the information on the database, inviting readers to submit commentary and alternative interpretations to their staff. I emailed the IDP about the bowl problem on July 15, 2008; the entry was corrected by July 29 when I called up the item again. Their response to this minor problem was rapid, but I received no acknowledgment that they received my email. Of course I would much rather that they fix the problem than that they tell me that they received my email.

Language issues

The database records seem rightly to use the language of the host institution for that fragment of the distributed database. Displays of search results from German collections at the "item overview" level are in German and not translated into English. Those in Russian collections do not seem to give any detailed information yet; they displayed little more than the most fundamental data -- collection, pressmark, site, form, language, material and size. My attempts to bring up detailed information gave me only copyright and contact information. When I requested details from those items in the National Library of China I was given a page that displayed copyright information, contact information and a brief history of the collection in English. Those in the Japanese collections and other Chinese collections were like those in the Russian collections. Each item shown on the search results had a place to display a thumbnail image and, if no details were displayed, gave contact information to request them.

The only language issues addressed in the web pages are those of the Dunhuang documents, not those that might be encountered by the user of the web site. For example, the Search Tips do not yet cover any differences in the languages used among the various hosts of the distributed database and how that might affect the responses. The language of the search terms in the direct search does not seem to be restricted, but the issue of whether the outcome of any search may be dependent on the language used in the query is not addressed.7 For the Advanced Search form, the indexed fields give you pull-down lists, and those I accessed were in English. For the Free Text search, I presume that since it is a wild-card text search, occurrences found are only those that match characters in the language used for the query. When I used "horse" in the direct search I got 82 items, all in the British Museum, London, and Institute of Oriental Studies, St. Petersburg. When I used "pferd@" with the wild-card character "@" for endings, I got none.
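Why the language of the query would matter for a wild-card text search can be illustrated with a minimal Python sketch. It assumes that "@" stands for any run of characters, as in the "pferd@" query above, and that matching is a case-insensitive "contains" test on the record text. The function name and the two sample records are invented for illustration, and the IDP's actual matching rules may differ.

```python
import re


def wildcard_contains(query, text):
    """Return True if `text` contains a match for `query`.

    "@" in the query is treated as "any run of characters" (an assumption
    based on the usage described in the review); everything else is
    matched literally, ignoring case.
    """
    parts = [re.escape(part) for part in query.split("@")]
    pattern = ".*".join(parts)
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Invented catalogue snippets: one described in English, one in German.
records = {
    "english": "Painted wooden panel showing a horse and rider",
    "german": "Holztafel mit Darstellung von Pferden und Reitern",
}

for query in ("horse", "pferd@"):
    hits = [name for name, text in records.items()
            if wildcard_contains(query, text)]
    print(query, hits)
```

Under these assumptions each query finds only the record described in its own language, so a German term returns nothing when the matching records happen to be catalogued in English.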

Issues Surrounding Database Structure

The two most disturbing aspects of the database are an apparent lack of uniformity among the data entered for the IDP's numerous collections and an apparent lack of clarity in the definition of terms used in the pull-down lists of indexed values used for advanced searches. I ran across examples of both, although the IDP emphasizes the use of standardized approaches to the data.

An example of the former surfaces with the display for pressmark "CH 652" in the Berlin-Brandenburgische Akademie der Wissenschaften from Toyuk (T III T 262). This is a paper fragment containing chapter 23 of the Yiqiejing yinyi of Xuanying. Its bibliography is listed not under the bibliographic section, but in the catalog section. A bibliographic search does not produce the full citation, and the link leads nowhere. The full citation is found under the catalogs on the Show Catalogues page.

A more complex example is illustrated by a search for silk artifacts at the Victoria and Albert Museum, London. (In an attempt to limit the results to fewer than 300, I performed all searches with the show entries with images only feature turned on; I performed the searches on August 21, 2008.) One of the indexed lists was for holding institution and one list entry is that museum. It was the other criterion, silk artifact, that caused immediate confusion. Silk is a textile, and is mentioned on the Search Tips as one of the media included in "textiles" under the Type of Artefact category, along with hemp, cotton and wool. To confuse matters further, the indexed lists for Form and Subject/Keyword also contain "textile." Furthermore "silk fragments" is an option under form while the other fibers are not. To be sure that I found all the silk artifacts at the Victoria and Albert, I needed to try several options. My searches and their results are listed below.

  1. "textiles" as form and "Victoria and Albert Museum" as holding institution found 519 examples; when I checked a random selection of the 300 "top finds" displayed, they were all silk.
  2. "silk fragment" as form and "Victoria and Albert Museum" as holding institution found no examples ("Silk" is not an option on the pull-down.)
  3. "silk fragment" as form and no holding institution selected found 76 examples; none from the "Victoria and Albert"
  4. No form selected and "Victoria and Albert Museum" as holding institution found 522 examples; when I checked a random selection of those 300 "top finds" displayed, they were the same list in the same order as found under search (1).
  5. "textiles" as form and no holding institution selected found 556 items; the resulting 300 items displayed appeared to be the same as found for search (1).
  6. "textile" as type of artefact selected and "Victoria and Albert Museum" as holding institution found 522 examples; the 300 "top finds" displayed were those found under search (1).
  7. a free text search on "silk" found 245 items; when I looked at the list, some items had no apparent connection to "silk" other than the obvious one of Silk Road.
  8. a free text search on "silk fragment@" produced only 4! All at the British Museum.
  9. a free text search on "textile@" produced only 34; and again a brief look at the items showed little connection to "textile."
At this point I stopped formulating searches. The count of 522 textiles in the Victoria and Albert Museum appeared under several searches. I could find at least minimal details about 300 of these. Without further information from an external source it was impossible to frame a query that I could be sure would allow me access to the remaining 222.
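The arithmetic behind that impasse can be laid out in a short Python sketch. The counts are the ones reported in the numbered searches above; the 300-item display limit is as described earlier in this review, and the function name is introduced here only for illustration.

```python
DISPLAY_LIMIT = 300  # maximum number of items listed per search, per the review

# Reported counts for the nine numbered searches above
# (search 2 found no examples).
reported = {1: 519, 2: 0, 3: 76, 4: 522, 5: 556, 6: 522, 7: 245, 8: 4, 9: 34}


def hidden_items(count, limit=DISPLAY_LIMIT):
    """Number of matching items that the results page cannot list."""
    return max(0, count - limit)


for search, count in reported.items():
    print(f"search ({search}): {count} found, "
          f"{hidden_items(count)} not displayed")
```

Only the four searches that exceeded the limit leave anything hidden; for the two that found 522 items, 222 items stay out of reach.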

Definitions of the terms found on the pull-downs would be helpful in this situation. Additionally, the same word appearing under multiple categories ("textile" as both a form and a type of artefact) is confusing and should be addressed.

Miscellaneous Observations

The database seems to handle texts and manuscripts better and more completely than objects. This may be only because, as an archaeologist, I demand more of databases of sites and artifacts than of written documents. I would also suggest that it may be indicative of the project's origins in cataloging the Dunhuang library and the staff's greater experience with manuscripts.

I am frustrated at the number of items that are labeled "digitization in-process" or "digitization not completed." This is to be expected with a project in progress with a goal of 90% of its material available to the public by 2015. It is also indicative of my recognition of the quality of the information that has been made available and my eagerness to use the complete database. I also look forward with great anticipation to the implementation of those ancillary features that are promised.

Free Text searches do not generate search criteria as the direct Search the IDP Database link in the ever-present menu in the left-hand frame does. This is explained in the Search Tips, but seems counter-intuitive.

Boolean operations are limited. "NOT" and "OR" are not options for any search at this time, while "AND" is available only for multiple indexed fields.

There is no glossary, and none seems to be planned. (An internal Google search for mentions of terminology, vocabulary and glossary produced no matches.) If screen displays include an unfamiliar term, other web sites can provide definitions. However, for search terms a glossary might provide the necessary guidance for limiting searches. This lack is an inconvenience for all but especially troubling if the goal is to provide data to inexperienced users and non-specialists. There is a concordance file listed in the database design that seems to address some of the language problems mentioned above.

Requests are sometimes slow to process. This probably only indicates the popularity of the site and the size of the database. I found few such delays late at night or early in the morning.

Funding Issues

The majority of funding is external to the member (host) organizations. These institutions provide primarily space, network facilities and miscellaneous costs, while other institutions and individuals provide funds for cash outlays. The list of supporters has over 130 individuals, foundations and companies who have donated money, time and products. Grants and donations of over GB £100,000 from foundations and nonprofit institutions for specific sub-projects are tallied on a separate page, with the annual amount and the specific project; there are nine such listings as of today. Typical grants listed here include a 1997-2000 National Heritage Memorial Fund, UK, grant of GB £148,000 to digitize and catalog material from Dunhuang in order to launch the website, and two 2006-2009 Ford Foundation grants totaling US $500,000 for training personnel and adding lesser known material to the database.8

A quick perusal of the funding listed here shows the tremendous monetary commitment required of a project of this scope. For 2007, there were grants totaling roughly US $560,000. Additionally there were smaller institutional and personal donations that were not quantified, income from contracting out the equipment and expertise acquired by the IDP staff and, finally, profits from the sales of IDP publications.

The abbreviated summary of funding underscores the enormous expense associated with providing quality digitized data in an accessible form. Obviously for smaller projects aiming at a similar degree of detail, the costs will be less, but the necessary finances should be addressed in all project budgets. It must also be remembered that the IDP is a digitization project, not a current excavation; so its funding needs and available monetary resources differ from those of most archaeological projects discussed in the CSA Newsletter. However, the need to provide long-range funding is present in both situations. The IDP has thought about the enormous financial burdens and tapped as great a variety of funding sources as is possible. In this regard they provide a good example for all.

Conclusions

This is an incredible site. I am impressed both by the depth of thought seen at the conceptual level and its implementation to date. The complete site is envisioned as much more than the database. The pages provide a context for the artifacts and sites found therein. The pages on conservation and digitizing of the artifacts provide a forum to discuss the procedures and ethical considerations raised by all such projects. The authors of the currently available online papers are knowledgeable and share their insights into problems that they have encountered. Those papers under the Education section provide a wonderful introduction to the Silk Road; more technical papers cover the practical aspects of assembling and preserving the artifacts and knowledge accumulated about the Silk Road. Finally, papers about the creation of this online resource provide much needed knowledge of the complexity and resources required for such projects.

The implementation of the database is in its infancy. The goal of 90% of the data available in 7 years (2015) shows how much further they have to go. I found the most problems and criticisms with the database, perhaps because I know more about databases than conservation and the other areas covered within the divisions of the site. Those problems that I raised in this article are annoying but minuscule in the scheme of things. As I performed longer and more detailed searches, I found more examples of minor flaws in the data and their displays. Errors are bound to happen at the start-up of any project of this size and scope. Every time that I looked closely at the database structure to try to understand the source of the problem, the database itself seemed to have the structure to resolve the problem. A list of these problems would lead to a negative impression that could not be further from the truth. The IDP breaks new ground with every step that it takes. We all benefit as much by the exposition of problems that the IDP personnel have encountered as by their actual solutions. At this stage in the implementation of the IDP's database, it is hard to distinguish among problems in the presentation, problems with the underlying database structure, and a current absence of data. The results of the IDP's efforts as presented on this website are impressive. The public face of the database will improve over time and is probably not the current top priority of the individuals responsible for the underlying data.

The IDP personnel have tackled a worthy project and set themselves an ambitious goal that may be impossible to reach. I personally doubt that anyone could satisfy all levels of scholarship within a single database and website, but the IDP should be commended for their approach. Today this is a fine resource, and when it is complete, it should become the premier resource for studying the Silk Road.

-- Susan C. Jones


1. There are too many articles to mention individual authors in this review, and many do not mention specific authors. However, the web site provides brief vitas for the leading figures. IDP: People, http://idp.bl.uk/pages/about_people.a4d, last site revision April, 2008; accessed 29 May 2008.

2. The Dunhuang page from the World Heritage Centre site, http://whc.unesco.org/en/400, updated 29 May 2008, visited 29 May 2008.

3. British Library, Online Gallery, The Silk Road, http://www.bl.uk/onlinegallery/features/silkroad/map.html, visited 29 May 2008.

4. I glanced at several of the German pages. They have the same organization as the English ones. The home page, however, differs slightly; it emphasizes activities at the German member institution, the Berlin-Brandenburg Academy of Sciences of the State Library of Berlin. Since I do not read the other 3 languages, I only checked that the links to these versions functioned. My version of the Firefox browser (combined with Windows NT) does not have the ability to display the Chinese and Japanese ideograph sets used. The problem lies in a lacuna in the Unicode fonts supplied by my version of Windows, since I use the default setting of UTF-8 in Firefox, the character set mentioned on the English homepage. The Cyrillic characters, which are included among the fonts I have on my PC, were displayed properly.

5. Search Tips section 2.2: Indexed fields have a limited list of values that can be used and in addition, "indexing using these values for all 50,000 entries on the database is only complete for type of artefact and holding institution." These limitations are acceptable for a database in progress. Search tips, http://idp.bl.uk/pages/help.a4d#2, last page revision July, 2007; accessed 15 July 2008.

6. The order appears to be a simple alphanumeric sort on the pressmark.

7. The database layout does contain a file that has field names indicating various languages. This file suggests that such issues have been considered at least theoretically.

8. IDP: Funding contains the complete list. http://idp.bl.uk/pages/about_funding.a4d, last page revision March, 2008; accessed 15 July 2008.

