[Humanist] 25.850 classifying grammatical structures?
Humanist Discussion Group
willard.mccarty at mccarty.org.uk
Tue Mar 27 07:06:42 CEST 2012
Humanist Discussion Group, Vol. 25, No. 850.
Department of Digital Humanities, King's College London
Submit to: humanist at lists.digitalhumanities.org
Date: Mon, 26 Mar 2012 17:08:48 +0100
From: Tom Salyers <tom.d.salyers at gmail.com>
Subject: Classifying grammatical structures?
I'm in the middle of a thesis involving computational stylistics and
authorship attribution, and the method I'm attempting to use is based
on the distribution of the grammatical structures of sentences in the
texts. The problem I've got at the moment is how to collect and
enumerate a number of structures that's large enough to allow for
classification of my data, but small enough that individual structures
will still have statistical significance, given how modular English is.
For example, say that there are two sentences in my corpus that are
grammatically structurally identical except Sentence A has one
adjective ("The black cat slept") and Sentence B has two ("The old
decrepit house collapsed"). Should I count them as two distinct types,
or put them in one category labeled something like "Determiner +
non-zero number of adjectives + noun + verb"?
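To make the two options concrete, here is a minimal sketch in Python. It assumes each sentence has already been reduced to a sequence of part-of-speech tags. The tag names (DET, ADJ, NOUN, VERB) and the helper collapse_runs are purely illustrative and not taken from any particular tagger or library.

```python
from collections import Counter
from itertools import groupby


def collapse_runs(tags, collapsible=("ADJ",)):
    """Replace each run of one or more consecutive collapsible tags with 'TAG+'.

    Hypothetical helper: which tags to collapse is a modelling choice.
    """
    out = []
    for tag, run in groupby(tags):
        run_length = len(list(run))
        if tag in collapsible:
            out.append(tag + "+")            # "non-zero number of adjectives"
        else:
            out.extend([tag] * run_length)   # leave other tags untouched
    return tuple(out)


# Hand-assigned tag sequences for the two example sentences.
sentence_a = ["DET", "ADJ", "NOUN", "VERB"]         # "The black cat slept"
sentence_b = ["DET", "ADJ", "ADJ", "NOUN", "VERB"]  # "The old decrepit house collapsed"

# Option 1: every distinct tag sequence is its own structure type.
exact = Counter(tuple(s) for s in (sentence_a, sentence_b))

# Option 2: adjective runs are merged, so both sentences share one type.
merged = Counter(collapse_runs(s) for s in (sentence_a, sentence_b))

print(len(exact), "types without collapsing;", len(merged), "type(s) with collapsing")
```

Under the first option the two sentences count as two types with one instance each. Under the second they fall into a single type with two instances, which trades some granularity for larger counts per category.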
Does anyone have any recommendations for papers or books by anyone
who's done this sort of structural classification? My own searches
aren't turning up much, but it's entirely possible I'm not using the
right keywords. Any and all advice is appreciated. Thanks in advance!