[Humanist] 24.603 culturomics and corpus linguistics

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sun Dec 19 10:26:50 CET 2010


                 Humanist Discussion Group, Vol. 24, No. 603.
         Centre for Computing in the Humanities, King's College London
                       www.digitalhumanities.org/humanist
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Finlay McCourt <finlaymccourt at gmail.com>                  (10)
        Subject: Google Labs - n-gram viewer

  [2]   From:    Mark Davies <Mark_Davies at byu.edu>                         (14)
        Subject: Culturomics / new Google Books interface / COHA


--[1]------------------------------------------------------------------------
        Date: Sat, 18 Dec 2010 12:25:44 +0000
        From: Finlay McCourt <finlaymccourt at gmail.com>
        Subject: Google Labs - n-gram viewer


I thought this recently released tool for the Google Books database would
be of interest to the list, and I hope some of you will find it useful.

As they put it:

"When you enter phrases into the Google Books Ngram Viewer, it displays a
graph showing how those phrases have occurred in a corpus of books (e.g.,
"British English", "English Fiction", "French") over the selected years."

More information can be found here: http://ngrams.googlelabs.com/info

The tool itself is located here: http://ngrams.googlelabs.com

-FMcC



--[2]------------------------------------------------------------------------
        Date: Sat, 18 Dec 2010 07:18:05 -0700
        From: Mark Davies <Mark_Davies at byu.edu>
        Subject: Culturomics / new Google Books interface / COHA

Many of you have probably heard by now about "Culturomics" (http://www.culturomics.org) and the new Google Books interface (http://ngrams.googlelabs.com/), which allows you to search 500 billion words of text to see changes in the frequency of words and phrases (and thus, changes in American society and culture). It's been widely featured in newspapers and magazines for the last day or two (see http://www.culturomics.org/cultural-observatory-at-harvard/papers).

I've created a page (http://corpus.byu.edu/coha/compare-culturomics.asp) that compares Google Books / Culturomics to the new 400-million-word Corpus of Historical American English [COHA] (http://corpus.byu.edu/coha/). It discusses how Google Books works well for exact words and phrases, but of the two, COHA is the only tool that really allows you to look at a wide range of changes -- lexical, morphological, syntactic, and semantic. Anyway, for those who might be interested...

Mark Davies

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
Web: http://davies-linguistics.byu.edu
 
** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================



