Vol. XXII, No. 3
January, 2010

Cloudy Skies?

Harrison Eiteljorg, II
(See email contacts page for the author's email address.)

"Cloud computing" is all the rage. And why not? The promise of letting more and more of the work and data storage surrounding computer work be done by experts off-site is compelling. The people doing such work off-site are surely, by virtue of economies of scale, far better equipped to deal with all the problems associated with software and data storage. They will have available to them, on demand, a level of computer expertise that few individual businesses, academic institutions, or projects could hope to have constantly at the ready.

Surely this is a prototypical no-brainer. It has been treated as such in the computer press, and more and more companies have been offering some form of remote computing, beginning most publicly with Picasa, the Google® photo service that offered most standard photo services, including storage of the images, at no charge and without clogging the user's computer with large image files. Google's Gmail system followed, though it was less of a novelty, given the free email services already extant. With both systems, the servers were maintained by Google, and the user was free to use the services as he or she desired. Building on those early examples, Google now offers a host of applications including word processing and spreadsheets, and Google is but one of many companies offering online data storage.

Not everyone is so enthused; for instance, see the October 26, 2009, column entitled "The cloud party will be hopping in 2010 -- leading to a hangover in 2011" by David Linthicum at InfoWorld's online site (posted 10/26/09 and last accessed 01/21/10 at www.infoworld.com/d/cloud-computing/cloud-party-will-be-hopping-in-2010-leading-hangover-in-2011-043?source=IFWNLE_nlt_cloud_2009-10-26). Another set of concerns is expressed by Zack Urlocker in his November 20, 2009, piece entitled "Will it be a winner-take-all market for the cloud?", also at InfoWorld (posted 11/23/09 and last accessed 01/21/10 at www.infoworld.com/d/open-source/will-it-be-winner-take-all-market-cloud-222?source=IFWNLE_nlt_cloud_2009-11-23).

There are some specific concerns, and Google has made them very apparent. The Google mail server has, for instance, gone down more than once in the last few months. In addition and far more important for this discussion, there is at least one confirmed story of Google-stored data taken down without warning, explanation, or any mechanism even to speak to or correspond directly with someone at Google in order to discuss the problem. (The vagueness of the foregoing stems from the victim's natural concern that a public airing of the problem could have undesirable consequences.) Google, of course, is in a different category than other suppliers since its offerings are generally free of charge. There is certainly a difference between what one expects -- perhaps demands -- when one has paid for a service and what one expects when the service has been free.

Even when paying for online storage, however, there are risks, and no amount of money can replace data simply lost. In CSA's case, for instance, the online web hosting company that was used at one time backed up our online web pages but not our email. On more than one occasion the system went down, and only good fortune and a quick recognition of the nature of the problem prevented serious losses. (Mail on the CSA office computer is now backed up, on a regular, though not daily, basis. It is copied onto a separate hard disk, and that hard disk is actually two identical ones, each containing the same files, so that a catastrophic loss is all but impossible.) Even now, despite the fact that web data files are backed up, the entire website is regularly compressed into a single file and stored on CSA's computer, not just on the web host.
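The routine just described (compress the whole website into a single file, then keep identical copies on two separate disks) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not CSA's actual procedure: the function names, the archive naming scheme, and the directory layout are all hypothetical, and the checksum comparison is an added safeguard that the text does not mention.

```python
import hashlib
import shutil
import tarfile
from datetime import date
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_and_mirror(site_dir: Path, mirrors: List[Path],
                       stamp: Optional[str] = None) -> List[Path]:
    """Compress site_dir into one .tar.gz and put a verified copy in each mirror.

    The mirror directories stand in for the separate backup disks (hypothetical paths).
    """
    if not mirrors:
        raise ValueError("at least one backup location is required")
    stamp = stamp or date.today().isoformat()

    # Build the single compressed file in the first backup location.
    first, rest = mirrors[0], mirrors[1:]
    first.mkdir(parents=True, exist_ok=True)
    archive = first / f"website-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(site_dir, arcname=site_dir.name)

    # Copy it to every other location and confirm each copy is identical.
    expected = sha256_of(archive)
    copies = [archive]
    for mirror in rest:
        mirror.mkdir(parents=True, exist_ok=True)
        target = mirror / archive.name
        shutil.copy2(archive, target)
        if sha256_of(target) != expected:
            raise IOError(f"copy at {target} does not match the original archive")
        copies.append(target)
    return copies
```

A call such as archive_and_mirror(Path("website"), [Path("/backup_a"), Path("/backup_b")]) would leave one dated archive in each location; the paths here are placeholders for wherever the two disks happen to be mounted.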

Is such care a sign of paranoia, senility, or just old-fashioned, unreasoning fear? Probably a bit of each. I have rarely lost files due to a hardware malfunction, very rarely. However, there were times in the early days of my work with computers when serious issues arose. In addition, I think it only reasonable to protect irreplaceable data with all possible care. Data that cannot be replaced should not be at risk.

A more recent problem with cloud computing confirmed the potential for problems: "Microsoft snafu calls into question its cloud reliability." Writing at InfoWorld's online site, Bill Snyder discussed Microsoft's stumble with the cloud (posted on 01/07/10 and last accessed on 01/21/10 at www.infoworld.com/t/cloud-computing/microsoft-snafu-calls-question-its-cloud-reliability-513?source=IFWNLE_nlt_daily_2010-01-07).

If it were difficult to store data files on local computers, there might be some reason to take the risks involved in off-site storage. Indeed, it may be argued that extremely large files may be safer on off-site computers because they can be so difficult to deal with. On the other hand, the newest of desktop computers often come with hard drives of one terabyte. On balance, it seems that off-site data storage is a risk that should only be taken for good and urgent reasons -- and after determining fully the policies and practices of the off-site system.

What about web-based software? Should one similarly worry about using web-based software supplied by one or another service company? My immediate reaction is to be less concerned. If I were to use a CAD or database management program supplied by Google or another service-provider who charged for the service, what would be the downside? I would, presumably, take the usual care to be sure about the utility of the software for my needs. The question that haunts me is the one of file format. Google can import many word-processing formats into its own and even export into many as well, but I have been unable to find any information about its own file format. Since exact translations of things like non-ASCII characters, formatting choices such as boldface and italics, outline-style numbering, and so on are so difficult, I would not want to count on exact correspondence between any two formats when the time comes to print the text -- or put it into PDF form. More critical -- and I admit that the cynic in me emerges here -- I do not want to be dependent upon Google or anyone else so thoroughly that, should they change something, I am unable to reject the change or otherwise prevent damage to files that are, after all, mine. Add to that the fact that, even if I copy a file from the Google server onto my desktop, what can I do with it if the format is proprietary?

Perhaps another short story relating a recent experience will help to illustrate my concern. I uploaded a family photograph to one of the sites (that will remain unnamed here) that offer quick and inexpensive photo printing. I ordered prints, and they arrived with the head of our youngest son partially missing. Checking with him first to be sure I had not missed a serious injury involving a partial lobotomy, I objected to the service. My objection was, not surprisingly, fielded by someone in another country on another continent, and that woman kept trying to advise me to have the prints made as 8 x 10 photos instead of 4 x 6. Since the photos in question were going out in holiday cards, this seemed a poor choice, and, my frustration finally becoming unbearable, I asked for and was connected to a supervisor. The supervisor admitted that this company could not print to the edge of an image -- any image -- at any time. That is, a good photograph, one that uses the entire frame so as to maintain maximum sharpness when enlarged, could not be printed without lopping off the top, bottom, and both sides! Had I uploaded the image, manipulated it and relied upon the site's storage to maintain it, what would I have chosen to do next? I do not know if I could have downloaded the image in a format that would have permitted me to use it elsewhere. I did not bother to ask such a question because, of course, I still had the image -- the one prior to my son's unfortunate lobotomy. As silly as this example may seem, this is just the kind of concern that suggests to me at least that I should retain as much control as possible of the tools I use for important data as well as the files that I create. To entrust critical matters to others who have no reason to understand their content or importance does not seem reasonable.

There is yet another, quite separate issue about cloud computing: data interoperability. The issue is discussed at length by David Linthicum in an InfoWorld article, "The data interoperability challenge for cloud computing" (posted 1/12/10 and last accessed 01/21/10 at www.infoworld.com/d/cloud-computing/data-interoperability-challenge-cloud-computing-259?source=IFWNLE_nlt_cloud_2010-01-18). In short, the issue revolves around making sure that data from one system can be transferred to and used by another. This is closely related to the issue of file format, but data may be transferred and used elsewhere even when the file format in which the data items are stored is proprietary. The potential consequence, however, is similar. If data are stored by a subcontractor, what control does the owner have?
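One modest way for an owner to keep some control is to test, before relying on any system, that the data can leave it in an open format and come back unchanged. The Python sketch below illustrates such a round-trip check using CSV as the neutral format; it is a minimal example under stated assumptions, not a description of any provider's export facility. The function names and the sample records are hypothetical, and every value is compared as text because CSV itself carries no data types.

```python
import csv
import io


def export_csv(records, fieldnames):
    """Write a list of dict records to CSV text (an open, documented format)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def import_csv(text):
    """Read CSV text back into a list of dict records; all values are strings."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text, newline=""))]


def survives_round_trip(records, fieldnames):
    """Return True if exporting and re-importing yields the same values as text."""
    expected = [{name: str(rec[name]) for name in fieldnames} for rec in records]
    return import_csv(export_csv(records, fieldnames)) == expected


# Hypothetical sample records, including non-ASCII text and an embedded comma.
sample = [
    {"id": 1, "site": "Propylaea, Athens", "note": "naïve façade entry"},
    {"id": 2, "site": "Pompeii", "note": "Ελληνικά"},
]
print(survives_round_trip(sample, ["id", "site", "note"]))
```

A check of this kind says nothing about formatting or structure that the open format cannot express, which is exactly where proprietary formats tend to cause trouble.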

All these arguments seem not to have impressed the federal Office of Management and Budget. David Linthicum, again writing in InfoWorld, discusses plans for requiring faster-than-might-be-optimal movement to the cloud by government agencies in "The feds may soon mandate cloud computing usage" (posted on 1/20/10 and accessed 1/26/10 at www.infoworld.com/d/cloud-computing/feds-may-soon-mandate-cloud-computing-usage-501?source=IFWNLE_nlt_cloud_2010-01-25).

I must conclude by acknowledging that, at some point in the future, cloud computing will probably win the day, and I, too, will use online software and online file storage. My concern, and the reason for writing this piece, is that those of us in charge of important and irreplaceable data should take care to avoid leaping too quickly into that brave new world. "The cloud" will not go away or leave us in the metaphorical dust if we are not early adopters, but it could inflict enormous harm if we are too-early adopters. People do not refer to technology's "bleeding edge" without reason.

-- Harrison Eiteljorg, II

An index by subject for all CSA Newsletter issues may be found at csanet.org/newsletter/nlxref.html; included there are listings for articles concerning the use of electronic media in the humanities.
