[Humanist] 28.447 Big Data no boondoggle

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Wed Oct 29 09:01:29 CET 2014

                 Humanist Discussion Group, Vol. 28, No. 447.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Willard McCarty <willard.mccarty at mccarty.org.uk>          (39)
        Subject: Big Data thresholds and tolerances

  [2]   From:    maurizio lana <maurizio.lana at gmail.com>                  (108)
        Subject: Re: [Humanist] 28.446 Big Data no boondoggle

  [3]   From:    Alan D Corre <corre at uwm.edu>                               (7)
        Subject: Re:  28.440 Big Data no boondoggle

        Date: Tue, 28 Oct 2014 08:23:14 +0000
        From: Willard McCarty <willard.mccarty at mccarty.org.uk>
        Subject: Big Data thresholds and tolerances

Dear Joris,

You say that,

> .... If you are talking just about how
> humanities faces big data because of things like JSTOR, then the
> epistemological problem is not new, right? We always knew there were far
> more journal articles, monographs, information, and data out there than we
> would ever be able to find and to gauge. Maybe digital archives just put
> that problem more clearly in our face. That again is an effect of what we
> call tongue-in-cheek Daniel O'Donnel's first law of computing: problems are
> not so much created through computing, but they are magnified manifold by
> it.

We could say, and I would agree, that *in principle* the problem is very 
old. But that's not how we used to think. When I wrote my dissertation 
on Milton's Paradise Lost in its relation to biblical and classical 
literature (late 1970s-early 1980s), it was still assumed that I would 
read everything that had been written on that topic, e.g. all criticism 
up to that time, all the major works of Greek and Latin literature, all 
of Augustine and so on. I did read quite a bit but not all. I did 
actually finish the thing, though it took me 8 years. I would assume 
that nowadays if anyone at all works on Milton no such assumption is made.

There are thresholds past which different things happen. The problem I 
was really thinking of was, however, not merely the known or estimable 
volume of relevant literature but the ease with which I can find out 
about and obtain items. The failure of mechanisms for retrieving items 
with the best precision/recall ratio, combined with natural curiosity 
and with that ease, is the difference that has made a difference. A 
threshold has been reached, I have crossed it, and nothing will ever be 
the same again.

The microscope (or I should say all kinds of microscopes) only magnifies. 
You could say all that stuff has always been there, so what's the big 
deal? I'd say, go read Hacking's "Do we see through a microscope?" and 
then think again. So I'd argue that yes, we do have a new 
epistemological problem, at least in practice.


Willard McCarty (www.mccarty.org.uk/), Professor, Department of Digital
Humanities, King's College London, and Digital Humanities Research
Group, University of Western Sydney

        Date: Tue, 28 Oct 2014 14:37:56 +0100
        From: maurizio lana <maurizio.lana at gmail.com>
        Subject: Re: [Humanist] 28.446 Big Data no boondoggle
        In-Reply-To: <20141028075036.BA170798C at digitalhumanities.org>

there's a blaze of light in every word
it doesn't matter which you heard
the holy or the broken Hallelujah
		l.cohen, hallelujah

the digital humanities course: http://www.youtube.com/watch?v=85JsyJw2zuw
the digital library of late Latin: http://www.digiliblt.unipmn.it/
a day in the life of DH2013: http://dayofdh2013.matrix.msu.edu/digiliblt/
what the digital humanities are: http://www.youtube.com/watch?v=4JqLst_VKCA
Maurizio Lana
Università del Piemonte Orientale, Dipartimento di Studi Umanistici
piazza Roma 36 - 13100 Vercelli
tel. +39 347 7370925

        Date: Wed, 29 Oct 2014 01:02:03 +0000
        From: Alan D Corre <corre at uwm.edu>
        Subject: Re:  28.440 Big Data no boondoggle
        In-Reply-To: <20141028075036.BA170798C at digitalhumanities.org>


For many centuries, in many diverse cultures, big data resided in the minds of scholars. Memorization was not despised; on the contrary, it was admired. It seems to me that modern science, whether accidentally or deliberately, demoted memorization, because corpora of big data acquired an authority which was not to be argued with. Copernicus, Galileo et al. learned how dangerous it was to question received authority. Anyway, here are some examples of big data from the Near East and at home with which I am familiar.

To this day many Muslim boys memorize the entire Quran in their childhood, and it is considered an achievement worthy of celebration. This means that the reader of Arabic poetry will immediately spot a reference to the sacred book, and enjoy his find. There were many who memorized professionally. I recall reading about a man who could recite a thousand poems beginning with the letter alif, i.e. this was just the beginning of his corpus! An outstanding scholar like Avicenna wrote 450 books on astronomy, alchemy, geography, geology, psychology, logic, mathematics, physics, poetry and theology. He probably had a good library there in Uzbekistan, but he had no access to the Internet, and must have internalized a vast store of information. Islam also has the Hadith, a vast corpus of sayings ascribed to the Prophet, of which various collections have been made. Each saying has an isnad, a chain of the authorities who handed the tradition down.

In the Jewish world, the Pharisaic tradition, which is the source of Orthodox Judaism, possesses a vast corpus of Oral Law and Midrash, the "sea" of the Talmud, which was originally entirely oral and only came to be written down when it was feared that it might be forgotten. Anciently, there were men who would professionally memorize the tradition, and quote it on request to the students. Many scholars of this tradition, who study in the Talmudic academies or Yeshivot, have an intimate knowledge of this huge corpus. I was told on good authority that Rabbi Koppel Kahana (Kagan), author of several books on Jewish and Western Law, would allow students to stick a pin through a random page of the Babylonian Talmud (in which the pagination is standard in all printings) and he would then declare the word on the other side of the page. The traditional reading of the Law of Moses, which occurs regularly in all synagogues, is from a consonantal Hebrew scroll, which has none of the vowels or the neumes defining the cantillation that are found in the printed text. Any deviation is pounced on by the congregation, and the passage must then be read correctly.

Nearer home, Edith Sitwell in her delightful book English Eccentrics tells of an Oxford professor (whose name escapes me) who knew the Latin and Greek classics by heart. On one occasion he was traveling in a stage coach where a young man was trying to impress some ladies by quoting Sophocles. The professor pulled out from his cloak a miniature edition of the complete works of the author, and said: "Young man, I do not recall that quotation. Pray show it to me." The deflated young man shouted to the driver: "Stop the coach. I want to get out."

At Haberdashers' school in London in the 1940s I was obliged to learn 14 lines of prose or poetry weekly. It was not a bad idea.

Alan Corré, Emeritus Professor of Hebrew Studies, University of Wisconsin-Milwaukee
