[Humanist] 28.801 pubs: historical American English; participatory edn of Ulysses
Humanist Discussion Group
willard.mccarty at mccarty.org.uk
Tue Mar 10 09:59:55 CET 2015
Humanist Discussion Group, Vol. 28, No. 801.
Department of Digital Humanities, King's College London
Submit to: humanist at lists.digitalhumanities.org
 From: Mark Davies <Mark_Davies at byu.edu> (15)
Subject: COHA: Downloadable full-text data (385 million words, 115,000 texts)
 From: Amanda Visconti <amandavisconti at gmail.com> (37)
Subject: Invitation for the Open Beta of the Infinite Ulysses
Participatory Digital Edition
Date: Mon, 9 Mar 2015 14:10:27 +0000
From: Mark Davies <Mark_Davies at byu.edu>
Subject: COHA: Downloadable full-text data (385 million words, 115,000 texts)
This announcement is for those who are interested in historical corpora and who may want a large dataset to work with on their own machine. This is a real corpus, rather than just n-grams (as with the Google Books n-grams; see a comparison at http://googlebooks.byu.edu/compare-googleBooks.asp).
We are pleased to announce that the Corpus of Historical American English <http://corpus.byu.edu/coha/> (COHA) is now available in downloadable full-text format <http://corpus.byu.edu/full-text/> for use on your own computer. COHA joins COCA <http://corpus.byu.edu/coca/> and GloWbE <http://corpus.byu.edu/glowbe/>, which have been available in downloadable full-text format since March 2014.
The downloadable version of COHA contains 385 million words of text in more than 115,000 separate texts <http://corpus.byu.edu/full-text/coha_full_text.asp>, covering fiction, popular magazines, newspaper articles, and non-fiction books from the 1810s to the 2000s.
At 385 million words, the downloadable COHA corpus is much larger than any other structured historical corpus of English. With this much data, you can carry out many types of research <http://corpus.byu.edu/coha/files/davies_corpora_2011.pdf> that would not be possible <http://corpus.byu.edu/compare-smallCorpora.asp> with the much smaller historical corpora of English, which run to 5-10 million words.
The corpus is available in several formats: sentence/paragraph text, PoS-tagged and lemmatized text (one word per line), and tables for import into a relational database. Samples <http://corpus.byu.edu/full-text/samples.asp> of each format (3.6 million words each) are available at the full-text website <http://corpus.byu.edu/full-text/>.
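As a rough illustration of working with the one-word-per-line format, here is a minimal Python sketch. It assumes, without confirmation from the announcement, that each line holds three tab-separated fields (word, lemma, PoS tag). The function name `lemma_counts` and the sample rows are hypothetical, so check the real column layout against the downloadable samples before relying on it.

```python
from collections import Counter


def lemma_counts(lines, pos_prefix=""):
    """Count lemmas in one-word-per-line data.

    ASSUMPTION: each line is 'word<TAB>lemma<TAB>PoS'. The real files
    may use a different column order or extra columns.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _word, lemma, pos = fields[0], fields[1], fields[2]
        # Keep only tokens whose tag starts with the requested prefix
        # (an empty prefix keeps every token).
        if pos.startswith(pos_prefix):
            counts[lemma.lower()] += 1
    return counts


# Made-up sample rows, for illustration only (not taken from COHA).
sample = [
    "The\tthe\tat\n",
    "horses\thorse\tnn2\n",
    "ran\trun\tvvd\n",
    "horse\thorse\tnn1\n",
]

noun_counts = lemma_counts(sample, pos_prefix="nn")
print(noun_counts.most_common(5))
```

For a real file, the same function can be fed an open file handle, since it only needs an iterable of lines.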
We hope that this new resource is of value to you in your research and teaching.
Professor of Linguistics / Brigham Young University
** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
Date: Mon, 9 Mar 2015 11:55:11 -0400
From: Amanda Visconti <amandavisconti at gmail.com>
Subject: Invitation for the Open Beta of the Infinite Ulysses Participatory Digital Edition
Today I launched a social digital edition of James Joyce's Ulysses
http://www.InfiniteUlysses.com as part of my doctoral dissertation. I'd
like to invite you to explore the site and share any feedback you might
have about your experience or how you might want to use such a text in the
Infinite Ulysses (InfiniteUlysses.com <http://www.InfiniteUlysses.com>) is
a "participatory" digital edition: it uses an authoritative text (the
Modernist Version Project's transcription of the 1922 Shakespeare and Co.
first printing), but allows readers of all backgrounds to highlight the
text and add annotations (interpretations, comments, and questions) with
the goal of creating a shared space of scholars and public enthusiasts
discussing the novel. A variety of filters let you customize the
annotations you see to your needs (e.g. don't show spoilers or translations
of Latin; do show definitions, instances of intertextuality or mentions of
Hamlet, and questions from other readers).
The edition is useful in the classroom, whether as a reading supplement,
an assignment ("add x annotations to the first episode of the novel"), or a
way to prep for class (e.g. remind yourself of the kinds of questions
first-time readers will have about the novel).
You may also be interested in the site as part of a digital humanities
dissertation with a unique format and methodology: design, code, user
testing, research blogging, and a final whitepaper discussing project
outcomes. I've blogged the dissertation over the course of the project at
LiteratureGeek.com <http://www.LiteratureGeek.com>.
Happy to hear any feedback or answer any questions via
infiniteulysses at gmail.com!
infiniteulysses at gmail.com
@Literature_Geek http://www.twitter.com/literature_geek and
LiteratureGeek.com http://literaturegeek.com/ (research blog)
Maryland Institute for Technology in the Humanities (MITH) Winnemore
Digital Dissertation Fellow
Ph.D. Candidate, University of Maryland English Department
M.S.I. (Digital Humanities HCI Specialization), University of Michigan
School of Information