[Humanist] 27.340 comparing corpora

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Sep 14 12:08:18 CEST 2013


                 Humanist Discussion Group, Vol. 27, No. 340.
            Department of Digital Humanities, King's College London
                       www.digitalhumanities.org/humanist
                Submit to: humanist at lists.digitalhumanities.org



        Date: Fri, 13 Sep 2013 12:22:24 -0400
        From: Alex Gil <colibri.alex at gmail.com>
        Subject: Re:  27.333 comparing corpora; farewell to the DHO
        In-Reply-To: <20130913074232.465DC2D78 at digitalhumanities.org>


Hi Hartmut, what do you mean by comparing?

Are you looking for differences between nearly identical texts?
http://www.juxtacommons.org/home/index
http://collatex.sourceforge.net/
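
To give a rough sense of this first kind of comparison, here is a minimal
sketch using difflib from Python's standard library. It is not how Juxta or
CollateX work internally; the two sample "witnesses" and the helper name
word_differences are made up for illustration.

```python
import difflib

# Hypothetical sample witnesses, invented for illustration only.
witness_a = "the quick brown fox jumps over the lazy dog".split()
witness_b = "the quick red fox leaps over the lazy dog".split()


def word_differences(a, b):
    """Return (operation, words_in_a, words_in_b) for each non-matching span."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only insertions, deletions and replacements
    ]


for tag, old, new in word_differences(witness_a, witness_b):
    print(tag, old, new)
```

Comparing word by word rather than character by character keeps the
reported variants readable; a real collation would also need to handle
punctuation, spelling normalization and more than two witnesses.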

Are you looking for clusters of repetition?
http://artfl-project.uchicago.edu/content/pair
http://superfastmatch.org/#1
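
As a toy illustration of this second kind of comparison, the sketch below
lists word sequences (n-grams) that two texts have in common. This is only
an assumption about the general idea, not a description of how PAIR or
superfastmatch are implemented; the sample passages and function names are
invented.

```python
def ngrams(words, n):
    """Return the set of all n-word sequences in a list of words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(text_a, text_b, n=3):
    """Return the n-word sequences that occur in both texts, sorted."""
    a = ngrams(text_a.lower().split(), n)
    b = ngrams(text_b.lower().split(), n)
    return sorted(a & b)


# Hypothetical sample passages, invented for illustration only.
doc_a = "It was the best of times it was the worst of times"
doc_b = "Some say it was the best of days"

for gram in shared_ngrams(doc_a, doc_b):
    print(" ".join(gram))
```

A larger n gives fewer but more convincing matches; small values of n will
mostly surface common phrases rather than genuine reuse.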

Stylo is for stylometrics, which lets you study similarities of style
across a corpus, not necessarily shared textual sequences or the lack
thereof. (I'm not sure whether it can do cluster recognition and output
the results in a useful form, so I apologize in advance.)

Best,
Alex.




