[Humanist] 22.604 reliability of digital texts

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Tue Mar 10 08:26:22 CET 2009

                 Humanist Discussion Group, Vol. 22, No. 604.
         Centre for Computing in the Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

        Date: Mon, 09 Mar 2009 06:08:41 -0300
        From: alckmar at cce.ufsc.br
        Subject: Re: [Humanist] 22.598 revolutionary suggestions; reliability of digital text
        In-Reply-To: <20090307084303.33C7D2EAAD at woodward.joyent.us>


In fact, some countries, governments, and national libraries have
developed or financed projects for the digital preservation of literary
texts in image format. However, I am concerned about the reliability of
literary texts in text format, that is, texts that have passed through
an OCR process. On the internet there is a great number of literary
works in text format, provided by private individuals, libraries,
research groups, etc. All of them play an important role in preserving
literary heritage, but we must find guidelines that can indicate the
quality, accuracy, and reliability of the digitized texts they make
available. Such guidelines will be important to the providers, of
course, but above all to readers of electronic texts.

Best regards,


Humanist Discussion Group <willard.mccarty at mccarty.org.uk> wrote:

>                  Humanist Discussion Group, Vol. 22, No. 598.
>          Centre for Computing in the Humanities, King's College London
>                        www.digitalhumanities.org/humanist
>                 Submit to: humanist at lists.digitalhumanities.org
>   [1]   From:    Steven Totosy <clcweb at purdue.edu>                    (40)
>   [2]   From:    Joris van Zundert <joris.van.zundert at gmail.com>     (153)
>         Subject: Re: [Humanist] 22.597 how have we acted on revolutionary
>                 suggestions?
>   [3]   From:    Wesley Raabe <wesley_raabe at yahoo.com>                (14)
>         Subject: Reliability of Digitized Text
> --[1]------------------------------------------------------------------------
>         Date: Fri, 6 Mar 2009 14:32:26 +0800
>         From: Steven Totosy <clcweb at purdue.edu>
>         Subject: totosy Re: [Humanist] 22.596 reliability of digitized texts?
>         In-Reply-To: <20090306061624.BE0252EE96 at woodward.joyent.us>
> greetings alckmar: what do you mean by "reliability"? if you mean
> their preservation, libraries in virtually all countries -- in
> addition to national and international agencies -- are working on
> software to preserve electronic texts. the first country to establish
> a national repository of electronic material was canada in 1998, in
> its national library (later renamed library and archives canada) and
> today most countries have one, either via a centralized agency or via
> private institutional structures/ventures, as in the US, although the
> library of congress has started to do some of it. if you mean the
> stability of URLs to access electronic material, that issue is
> evolving, too, although by now there are several methods to maintain
> the stability of the URL. hope this helps, best, steven totosy
> (editor, clcweb)
> On Mar 6, 2009, at 2:16 PM, Humanist Discussion Group wrote:
>>                 Humanist Discussion Group, Vol. 22, No. 596.
>>         Centre for Computing in the Humanities, King's College London
>>                       www.digitalhumanities.org/humanist
>>                Submit to: humanist at lists.digitalhumanities.org
>>        Date: Thu, 05 Mar 2009 05:03:50 -0300
>>        From: alckmar at cce.ufsc.br
>> I am interested in the reliability of digitized literary texts. Do you
>> know of anyone working on it?
>> Best regards,
>> Prof. Alckmar Luiz dos Santos
>> Universidade Federal de Santa Catarina - CNPq/Brazil
>> Invited researcher at Universidad Complutense de Madrid
> --[2]------------------------------------------------------------------------
>         Date: Fri, 6 Mar 2009 11:50:39 +0100
>         From: Joris van Zundert <joris.van.zundert at gmail.com>
>         Subject: Re: [Humanist] 22.597 how have we acted on revolutionary
>                 suggestions?
> Dear Willard,
> I moved June '08. I had a choice to make about the printers.
> They didn't make it. I hadn't printed anything in two years.
> (Except for one letter that necessarily had to be physically delivered -
> because of the signature. The letter, by the way, was sent to town hall. There
> - I know this because my wife is working there - all 'snail mail' is scanned
> and electronically distributed.)
> I read all articles and policy stuff from screen. My inbox and desktop icons
> remind me of what I think I should read. The breakthrough of ebooks has been
> proclaimed several times in vain in the past, so I should be careful. But
> nevertheless I'd say the pending arrival of coloured, non back-lit
> electronic paper will increase the use of electronic texts. Not for novels
> of course, as publishers seem to think, but for all other paperwork.
> Novels, that's another thing, another setting, another sensation altogether.
> There's a kind of pleasurable physicalness to novels I as a reader value -
> the physicalness is part of the engagement with the novel. But that's from
> me: one whose first encounter with electronics was Pong and who has used
> cassette recorders and dial phones; but only has four vinyl records because
> the CD-player had arrived. People younger than me are far heavier users of
> Twitter and Chat than me. So it seems media-use is shifting with
> generations. So, are people < 30 yrs. in general book users? Any surveys
> known?
> But Willard, what strikes me most in your elaborate question is the tone of
> casual obviousness in "In both cases we can be very glad the suggestions
> were not acted on [...]".
> Why's that? Except for the fact that it might not have been a very practical
> medium these days, there seems to me nothing wrong with the intended
> transformation of the texts. Do we need all the books as books? I'd bet the
> majority of text does not need fancy covers, high grade paper and eye candy
> typefaces. That which serves its intended practical purpose within a few
> years of writing, might even be better off not appearing in print, but just
> in digital form so it can be circulated quite a lot more efficiently.
> Evolution of digital texts will select what appears to be valuable and might
> even be put in print - because in the end print is more durable than any form
> of bits and bytes.
> I think this. Where a text has a practical purpose, a direct and concrete
> application for a short term (short term as in 1 to 10 years), it will
> eventually only arrive and be used in digital form. Where text has an
> aesthetic function (as well), it will be put and used in both digital and
> physical form. More and more, making a text physical will be a statement for
> it being more than just any text, something worth keeping for posterity.
> Effectively this will turn libraries into museums (if they are not already).
> The physical book can thus be compared to the painting. For all practical
> purposes I can do with the facsimile, either in print or digital form -
> probably digital. For some purposes you'd have to go to the museum, or buy the
> thing yourself.
> So sorry to answer your 'jumble of anecdotes' with a 'jumble of thoughts',
> but you had me going there...
> And the printer? Well actually it's back with a vengeance. As long as
> electronic full color paper of 600+ dpi does not exist, I need some other
> way to appreciate my lucky shot photos in all their splendor. And my
> daughter needs a coloring picture once in a while.
> y.s.,
> Joris van Zundert
> PS Just struck by a counter example. Although its publisher does all he can
> to get me to read the virtual one, I still only read the paper because I'm
> reminded of its existence by it lying on the doormat every morning. But
> this also seems to be a dying tradition of 30+ people.
> --
> Mr. Joris J. van Zundert, MA
> Dept. of Software R & D
> Huygens Institute
> Royal Netherlands Academy of Arts and Sciences
> Contact information can be found at
> http://www.huygensinstituut.knaw.nl/index.php?option=com_content&task=view&id=222&Itemid=125&lang=en
> --[3]------------------------------------------------------------------------
>         Date: Fri, 6 Mar 2009 08:14:01 -0800 (PST)
>         From: Wesley Raabe <wesley_raabe at yahoo.com>
>         Subject: Reliability of Digitized Text
>         In-Reply-To: <20090306061700.A00E42EEE8 at woodward.joyent.us>
> Although it is common to ask whether digital texts are reliable, I   
> believe that most general answers obscure more than they clarify.   
> Some digital texts can be relied on for certain purposes. For   
> example, when I am seeking a passage in a work that I've read, so   
> long as I have a few words in memory it's much faster to search   
> digital texts than thumb through books on my shelf or run down to   
> the library. But if I wanted to quote for a scholarly submission, it
> would depend on whether my interest is the text as printed in a
> particular print copy or as published in an electronic archive.
> Either may be interesting.
> Just as early editions are more likely to be authoritative than
> later reprints, and reprints published by university presses under
> the editorship of a scholar are more likely to be authoritative than
> cheap paperbacks, works published in scholarly digital archives are
> more likely to be authoritative than mass digitization projects.
> Even then, general rules may always be wrong in particular instances.
> But I find that the interaction between print and digital forms can   
> be quite entertaining. Kenneth Lynn's Harvard edition of Uncle Tom's
> Cabin (1962) was recently digitized by Google. One characteristic of
> optical character recognition software, that the character sequence
> "rm" may become "nu," produces this humorous misquote (accessed
> 6 March 2009):
> "O, there V Mammy!" said Eva, as she flew across the
> room; and, throwing herself into her anus, she kissed her ipeutedly.
> The line, while obviously faulty, does have a certain resonance if   
> one is familiar with almost any copy of the book. At this chapter's   
> conclusion, another character follows the path Eva takes in the   
> digital version (again, from Google Books).
> " I put this lady under your care; she is tired, and wants rest;   
> take her to her chamber, and be sure she is made comfortable;" and   
> Miss Ophelia disappeared in the rear of Mammy.
> It is my hope that some well-meaning editor will reprint the   
> standard edition of UTC in a new paperback and borrow the Google   
> Books version as the base text. May a critic, who trusts the reprint
> text, cite it. And then I have a title for a brief article: "The
> Anal Aesthetics of Uncle Tom's Cabin: Into Mammy's Rear."
> I recommend for most general purposes Lisa Spiro's blog post on the
> same topic:
> http://digitalscholarship.wordpress.com/2008/05/09/evaluating-the-quality-of-electronic-texts/
> I have also prepared a more serious statement about the "Reliability
> of Electronic Texts" on my blog:
> http://wraabe.wordpress.com/reliability-of-electronic-texts/
> Any serious answer must grapple with the fact that print and digital
> textuality are thoroughly intermingled in our own time, not only in
> the mind of the readers but also in the constitution of new printed
> texts.
> But, if a general rule is sought, texts are prepared, read, and   
> processed by humans or human agents, and humans have different uses   
> for texts. Reliability is inversely proportional to the distance
> between the purpose for which the text was prepared and the purpose
> to which you would put it.
> Wesley Raabe
> Assistant Professor
> Kent State University
> _______________________________________________
> List posts to: humanist at lists.digitalhumanities.org
> List info and archives at: http://digitalhumanities.org/humanist
> Listmember interface at:   
> http://digitalhumanities.org/humanist/Restricted/listmember_interface.php
> Subscribe at: http://www.digitalhumanities.org/humanist/membership_form.php
