[Humanist] 29.772 EEBO OCR'd

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Mar 12 09:00:59 CET 2016


                 Humanist Discussion Group, Vol. 29, No. 772.
            Department of Digital Humanities, King's College London
                       www.digitalhumanities.org/humanist
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Gabriel Egan <mail at gabrielegan.com>                       (24)
        Subject: Re: [Humanist] 29.770 EEBO OCR'd

  [2]   From:    Laura Mandell <laura.mandell at gmail.com>                   (24)
        Subject: Re:  29.770 EEBO OCR'd


--[1]------------------------------------------------------------------------
        Date: Fri, 11 Mar 2016 09:00:39 +0000
        From: Gabriel Egan <mail at gabrielegan.com>
        Subject: Re: [Humanist] 29.770 EEBO OCR'd
        In-Reply-To: <20160311084544.D022A621F at digitalhumanities.org>


Aaron McCollough writes:

 > FWIW, this [Laura Mandell's posting about Typewright]
 > is an invidious characterization of the EEBO-Text Creation
 > Partnership... a project without which 18th Connect would
 > not have been feasible, and a project which has been
 > facilitating significantly the research of early modern
 > scholars for over 15 years. Nothing against Laura and her
 > project, but it shouldn't be necessary to throw shade on
 > needful work that came before in order to bring one's own
 > work into the light.

I'm sure Laura didn't mean it to sound that way, and indeed
I didn't read it so. Rather, by referring to the need to
improve the OCR of the bits of EEBO that TCP hasn't manually
keyboarded, I understood Laura to be paying TCP an implicit
compliment. If only, she suggests, the rest of the texts in
EEBO were as wonderfully accurate as the TCP'd ones.

I'm sure that everyone here, Laura included, thinks that TCP
represents the best of what this community has achieved in
relation to the digitization of pre-18th-century books. Those
of us who benefit from what you did, Aaron, are deeply grateful.
I know that's true of the New Oxford Shakespeare editors whose
new edition relies heavily on searches in EEBO-TCP.

Gabriel Egan
Centre for Textual Studies, De Montfort University



--[2]------------------------------------------------------------------------
        Date: Fri, 11 Mar 2016 08:22:09 -0600
        From: Laura Mandell <laura.mandell at gmail.com>
        Subject: Re:  29.770 EEBO OCR'd
        In-Reply-To: <20160311084544.D022A621F at digitalhumanities.org>


Dear Aaron and Humanist:

I'm so so sorry! When I said "these untranscribed and poorly transcribed"
documents, I meant to be referring ONLY to the OCR that we have created,
not at all to the TCP texts!

The TCP texts will not be part of the dark archive, ONLY the texts that are
not transcribed or for which the OCR is too bad to use.  That's why we are
asking people to correct the OCR'd documents.

None of the TCP texts are in TypeWright for correction, ONLY the dirty OCR
that we created.

I am sincerely sorry for the confusion, and of course, we used the Text
Creation Partnership texts as our Ground Truth.

I am revising the announcement to make it more clear that I'm referring to
the 85,000 OCR'd texts and nothing from the TCP.

Best, Laura Mandell



-- 
Laura Mandell
Director, Initiative for Digital Humanities, Media, and Culture
Professor, English
Texas A&M University
p: 979-845-8345
e: idhmc at tamu.edu
@mandellc
http://idhmc.tamu.edu




