[Humanist] 23.762 skills for humanities computing

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Tue Apr 13 17:06:45 CEST 2010

                 Humanist Discussion Group, Vol. 23, No. 762.
         Centre for Computing in the Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

        Date: Sun, 11 Apr 2010 22:50:26 -0600
        From: Mark Davies <Mark_Davies at byu.edu>
        Subject: RE: [Humanist] 23.760 skills for humanities computing
        In-Reply-To: <20100410125446.C71FA54589 at woodward.joyent.us>

>> There is one absolute requirement for Humanities Computing: you must
>> belong to the TEI. You don't necessarily need to know XML (although
>> it might be good if you understood how it works) but you have to know
>> the standard software used to process it (when I say standard, I mean
>> software commonly used by TEI members and generally open source).

I guess it depends on how one defines "humanities computing" and whether corpus linguistics is part of it (I like to think that it is). For large corpora of 100-400 million words (like those at corpus.byu.edu), TEI and XML are often more of a hindrance than a help. While TEI and XML might work nicely for small collections of texts of 1-10 million words, they are simply not scalable enough and are far too cumbersome for "industrial strength" corpora, where relational databases are the norm (SQL Server, MySQL, or architectures like Corpus Workbench, Sketch Engine, BNCweb, etc., which are based on underlying RDBMSs).
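
To make the relational approach concrete, here is a minimal sketch of what a one-row-per-token layout might look like. Everything in it is an assumption for illustration: the table name `tokens`, its columns, the sample rows, and the helper functions `frequency` and `next_words` are hypothetical, and Python's built-in sqlite3 module stands in for a server such as SQL Server or MySQL. It is not the schema used at corpus.byu.edu.

```python
import sqlite3

# Hypothetical layout: one row per token, with an index on the word form.
# sqlite3 (in memory) stands in for a full RDBMS server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tokens (text_id INTEGER, position INTEGER, word TEXT, pos TEXT)"
)
conn.execute("CREATE INDEX idx_tokens_word ON tokens (word)")

# Invented sample data: two tiny "texts" of three tokens each.
sample = [
    (1, 0, "the", "DET"),
    (1, 1, "corpus", "NOUN"),
    (1, 2, "grows", "VERB"),
    (2, 0, "the", "DET"),
    (2, 1, "corpus", "NOUN"),
    (2, 2, "scales", "VERB"),
]
conn.executemany("INSERT INTO tokens VALUES (?, ?, ?, ?)", sample)


def frequency(word):
    """Count occurrences of a word form; the index on `word` serves this lookup."""
    row = conn.execute(
        "SELECT COUNT(*) FROM tokens WHERE word = ?", (word,)
    ).fetchone()
    return row[0]


def next_words(word):
    """List words immediately following `word`, using a self-join on position."""
    return conn.execute(
        """
        SELECT b.word, COUNT(*) AS n
        FROM tokens AS a
        JOIN tokens AS b
          ON b.text_id = a.text_id AND b.position = a.position + 1
        WHERE a.word = ?
        GROUP BY b.word
        ORDER BY n DESC, b.word
        """,
        (word,),
    ).fetchall()
```

With a layout of this kind, frequency counts and simple collocate queries become ordinary indexed SQL lookups and joins, with no need to parse marked-up text at query time. A real corpus would of course need many more columns and indexes.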

Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
Web: http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
