[Humanist] 29.692 big vs small

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Feb 6 09:37:05 CET 2016


                 Humanist Discussion Group, Vol. 29, No. 692.
            Department of Digital Humanities, King's College London
                       www.digitalhumanities.org/humanist
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Michael Falk <michaelgfalk at gmail.com>                     (64)
        Subject: Re:  29.688 big vs small? (a naive question)

  [2]   From:    "Center for Comparative Studies"                           (8)
                <centrostudicomparati at libero.it>
        Subject: Re:  29.688 big vs small? (a naive question)

  [3]   From:    Bill Kretzschmar <kretzsch at uga.edu>                       (10)
        Subject: Re:  29.688 big vs small? (a naive question)

  [4]   From:    "Allen B. Riddell" <abr at ariddell.org>                     (15)
        Subject: Re: [Humanist] 29.688 big vs small? (a naive question)


--[1]------------------------------------------------------------------------
        Date: Fri, 05 Feb 2016 07:08:43 +0000
        From: Michael Falk <michaelgfalk at gmail.com>
        Subject: Re:  29.688 big vs small? (a naive question)
        In-Reply-To: <20160205063251.87AB37EF0 at digitalhumanities.org>


I think Paul Feyerabend asked the right question: "How many constellations
are in the sky?"

You certainly see things on the large scale (e.g. constellations) that you
can't see on the small (e.g. the sun). But all Big Data is essentially just
counting, and you have to know what you're counting before you can count it.

I often have this thought when I read papers about genre. No-one seems to
agree about the generic designation of any classic novel, and lots of smart
people think that every text is a hybrid. Is *Emma* a domestic novel, a
courtship novel, or a *Bildungsroman*? Are the "male" and "female"
*Bildungsroman* the same thing, or different? On the one hand, perhaps
distant reading, by taking us away from these contentious issues, might
expose new and different relationships between texts. On the other hand, if
you want to look at generic trends over time, you probably need to mark
texts up with a genre, and this conversion of text to data might commit you
to a position on a contentious debate without your knowledge.

Close and distant reading depend on one another, and the only solution for
digital humanists is the old one, of shuttling between particular
observations and large generalisations, of sharing knowledge and expertise,
of never assuming that a particular question or a particular method is the
"right" one.

Cheers,

Michael

On Fri, Feb 5, 2016 at 6:32 AM Humanist Discussion Group <
willard.mccarty at mccarty.org.uk> wrote:

>                  Humanist Discussion Group, Vol. 29, No. 688.
>             Department of Digital Humanities, King's College London
>                        www.digitalhumanities.org/humanist
>                 Submit to: humanist at lists.digitalhumanities.org
>
>
>
>         Date: Thu, 4 Feb 2016 09:12:11 +0000
>         From: Willard McCarty <willard.mccarty at mccarty.org.uk>
>         Subject: a naive question
>
>
> Let me ask a naive question: if scale matters fundamentally, in the sense
> that larger aggregates of things manifest new properties not observed in
> smaller ones, then how do we know what if anything not observable in the
> smaller but seen in the larger is relevant to the smaller? Does the
> difference made by scale imply that Big Data is a realm of its own? -- that
> (with apologies to Jason Ensor) distant reading means getting close to
> *something else*, with problematic connection to any individual text? How
> are so-called bigger pictures connected to smaller ones? Are the two
> incommensurable?
>
> (My question, by the way, was triggered by P. W. Anderson's
> "More is different", Science NS 177.4047 (1972): 393-6.)
>
> Yours, WM
> --
> Willard McCarty (www.mccarty.org.uk/), Professor, Department of Digital
> Humanities, King's College London, and Digital Humanities Research
> Group, Western Sydney University

-- 
Michael Falk
Assistant Lecturer
PhD Student
School of English
University of Kent, UK

t: @walk_the_falk
e: M.G.Falk at kent.ac.uk / michaelgfalk at gmail.com
m: +44 7592 524215 / +61 405 383 276
skype: michaelgfalk



--[2]------------------------------------------------------------------------
        Date: Fri, 5 Feb 2016 12:13:05 +0100
        From: "Center for Comparative Studies" <centrostudicomparati at libero.it>
        Subject: Re:  29.688 big vs small? (a naive question)
        In-Reply-To: <20160205063251.87AB37EF0 at digitalhumanities.org>


I think this misunderstanding springs from the difficulty of applying
statistical laws, usually observed in large corpora of data, to smaller
corpora. But it is probably a mistake of perspective. Is the force of gravity
active only among big bodies?
Francesco

--[3]------------------------------------------------------------------------
        Date: Fri, 5 Feb 2016 15:16:06 +0000
        From: Bill Kretzschmar <kretzsch at uga.edu>
        Subject: Re:  29.688 big vs small? (a naive question)
        In-Reply-To: <20160205063251.87AB37EF0 at digitalhumanities.org>


At least for Big Data in human populations, such as collections of language, it is hard to associate big and small. Horvath and Horvath say with regard to language that we should not expect to apply generalizations at higher levels of scale to lower levels of scale (called the “ecological fallacy” by the Horvaths), and we should not expect any individual fairly to represent the behavior of a locality, or any locality fairly to represent the behavior of a region (called the “individual fallacy” by the Horvaths). Within complex systems, as I have written about them with regard to language (in *Language and Complex Systems*, where you can find the Horvath references), we know that the scaling property produces the same distributional pattern, an asymptotic hyperbolic curve for frequencies of alternative possibilities (like the words "thundershower," "thunder and lightning," and a great many others to refer to a thunderstorm), at all levels of scale, but as the Horvaths have said, we cannot expect to find exactly the same components at exactly the same frequencies at different levels of scale. So, big and small are likely to be different in data related to complex systems (language, economic markets, your immune system, evolutionary biology, quanta in physics). The same quantitative fractal pattern will be present, but what's most common and what's least common are most likely different in big and small samples. The same is not true for data that is normally distributed (in a statistical sense), like people's heights in a population: Big Data and smaller collections are likely to tell you the same thing.

Bill
__________________________________________________
Bill Kretzschmar
Harry and Jane Willson Professor in Humanities
Dept of English, Park 317, Univ of Georgia, Athens, GA  30602
Tel: 706-542-2246     www.lap.uga.edu
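
[Editorial illustration, not part of the message above.] The contrast drawn above between skewed and normally distributed data can be tried out in a few lines of Python. This is only a rough sketch under stated assumptions: the population of 1,000 variants with frequencies proportional to 1/rank is hypothetical (a stand-in for the "asymptotic hyperbolic curve"), as is the normal population of heights with mean 170 cm and standard deviation 10 cm. Sample size is used here as a simple proxy for level of scale, which is a simplification of the argument.

```python
import random
from collections import Counter
from statistics import mean

rng = random.Random(2016)  # fixed seed so reruns give the same draws

# Hypothetical skewed population: 1,000 variants, weight proportional to 1/rank.
variants = [f"variant_{rank}" for rank in range(1, 1001)]
weights = [1.0 / rank for rank in range(1, 1001)]


def top_variants(sample_size, how_many=10):
    """Draw a random sample and return its most frequent variants."""
    sample = rng.choices(variants, weights=weights, k=sample_size)
    return [name for name, _ in Counter(sample).most_common(how_many)]


def mean_height(sample_size, mu=170.0, sigma=10.0):
    """Mean of a random sample from a hypothetical normal population."""
    return mean(rng.gauss(mu, sigma) for _ in range(sample_size))


small_top, large_top = top_variants(50), top_variants(50_000)
shared = set(small_top) & set(large_top)
print(f"top-10 variants shared by small and large samples: {len(shared)}")
print(f"mean height, n=50:     {mean_height(50):.1f}")
print(f"mean height, n=50,000: {mean_height(50_000):.1f}")
```

Under these assumptions one would expect the two mean heights to land close together, while the lists of most frequent variants in the small and large samples need not agree beyond the few commonest items.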


--[4]------------------------------------------------------------------------
        Date: Fri, 5 Feb 2016 15:02:43 -0500
        From: "Allen B. Riddell" <abr at ariddell.org>
        Subject: Re: [Humanist] 29.688 big vs small? (a naive question)
        In-Reply-To: <20160205063251.87AB37EF0 at digitalhumanities.org>

Dear Willard,

Scale does not "matter fundamentally". There are advantages that
often---but not invariably---accompany the use of larger samples. One
advantage of large random samples is that one tends to gain a better
appreciation for the diversity in a population (of people, artistic
works, etc.). Small samples tend to be less helpful as guides to the
diversity present in a population.

Best wishes,

Allen Riddell

--
Allen Riddell
Neukom Fellow
Neukom Institute
Dartmouth College
abr at ariddell.org
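
[Editorial illustration, not part of the message above.] The point about sample size and diversity can be checked with a small simulation. The sketch below is a minimal one and assumes a hypothetical population of 500 kinds of item, a few common and many rare (weights proportional to 1/rank), drawn by simple random sampling; none of these figures come from the message.

```python
import random

rng = random.Random(29692)  # fixed seed for repeatability

# Hypothetical population: 500 kinds of item, a few common and many rare.
kinds = list(range(500))
weights = [1.0 / (k + 1) for k in kinds]


def distinct_kinds(sample_size):
    """Number of different kinds that turn up in one random sample."""
    return len(set(rng.choices(kinds, weights=weights, k=sample_size)))


for n in (10, 100, 1_000, 10_000):
    print(f"sample of {n:>6}: {distinct_kinds(n)} of {len(kinds)} kinds seen")
```

The sketch relies on the samples being random. A large sample gathered in a biased way carries no such tendency, which fits the qualification "often---but not invariably".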




