[Humanist] 24.273 getting involved

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sun Aug 22 22:00:27 CEST 2010

                 Humanist Discussion Group, Vol. 24, No. 273.
         Centre for Computing in the Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Willard McCarty <willard.mccarty at mccarty.org.uk>          (52)
        Subject: complexity; demonstrating and educating

  [2]   From:    James Rovira <jamesrovira at gmail.com>                       (7)
        Subject: Re: [Humanist] 24.270 getting involved

  [3]   From:    amsler at cs.utexas.edu                                      (49)
        Subject: Re: [Humanist] 24.270 getting involved

  [4]   From:    Peter Batke <batke_p at hotmail.com>                         (15)
        Subject: RE: getting involved

        Date: Sun, 22 Aug 2010 07:46:09 +1000
        From: Willard McCarty <willard.mccarty at mccarty.org.uk>
        Subject: complexity; demonstrating and educating

Two comments on the discussion in Humanist 24.270,
about getting involved.

First, understanding the algorithmic process step by step. This is 
possible and possibly enlightening for simple programs but, I'd think, 
not always, and for complex systems not at all. When a system becomes so 
large and intricate that its behaviour cannot be predicted, what's the 
point? I can see that understanding fundamentals of digital 
decision-making from simple examples has great benefits for a scholar 
who has never before translated some human task into software. 
Understanding how these fundamentals apply to his or her scholarly 
objects brings the lesson to life. But how about the situation, now 
commonplace, when one simply cannot follow what the software system 

Second, demonstrating (as we say) "evidence of value" to our colleagues. 
Here I am especially interested in getting help with thinking through 
the problem. It is a complicated one. And I wonder, after six decades, 
why do we have this problem? Isn't it worth asking why we ask a question 
as well as immediately falling to answer it? 

I wonder what demos are for, especially why they don't work 
particularly well in the research setting, why hands-on is so crucial.

Of course we need to be better explainers. We in the humanities aren't 
practiced in the arts of explanation to non-specialists. Until recently, 
I think, we didn't see it as a priority at all (unlike the natural 
scientists, who have poured enormous and intelligent efforts into 
popularisations). But in addition to assembling an array of success 
stories to back up our claims, experience suggests to me that people 
don't get it until they can internalise the demonstrated process and 
re-imagine it in terms of their own materials. The question is not 
simply relevance but relevance to what? It often isn't sufficient to 
show, say, a concordance to a Latin text when the person to be persuaded 
works with German ones -- German, after all, works differently. I 
wouldn't assume that the Germanist is being stupid, only that his or her 
imagination hasn't been contacted. Or suppose the Germanist does see the 
point of concording. If that sort of process doesn't fit into how the 
research has been conceived, then concording might not seem relevant. 
Indeed, it might not be at all. But thinking about the project 
algorithmically from the get-go, imagining it in those terms, then 
implementing it should (I think we want to argue) be one possibility the 
Germanist has to hand. If our objective is to increase the range of 
human possibility well beyond what we individuals know how to do, then 
don't we need the help of people who think other thoughts in other ways? 
Even a demonstration of how to drive a car or ride a bicycle doesn't get 
the novice very far.


Willard McCarty, Professor of Humanities Computing,
King's College London, staff.cch.kcl.ac.uk/~wmccarty/;
Professor, Centre for Cultural Research, University of Western Sydney,
Editor, Humanist, www.digitalhumanities.org/humanist;
Editor, Interdisciplinary Science Reviews, www.isr-journal.org.

        Date: Sat, 21 Aug 2010 18:14:54 -0400
        From: James Rovira <jamesrovira at gmail.com>
        Subject: Re: [Humanist] 24.270 getting involved
        In-Reply-To: <20100821210601.E900364B81 at woodward.joyent.us>

Elijah -- 

I agree with your statement below to the extent that it applies to - digital - humanities as a distinct field.  A digital humanities scholar needs to know programming, yes.  But I think you go on to conflate digital humanities with all humanities.  It is simply not true that all humanities scholars need to know programming in order to be humanities scholars, even ones who rely upon digital tools for their work.  All that I really need to do my work is a library, a pad, and a pencil or pen.  

I would add that scholars working with paintings, etc., need to know how digital environments can change the appearance of visual works.  A scanned image or photograph really isn't a substitute for the real thing.  

Jim R   

>  Would you be able
>  to take that person seriously?  The analogy holds in a further manner: there is no requirement that one be fluent, merely capable of understanding the arguments and structures relayed by the text.

        Date: Sat, 21 Aug 2010 20:00:39 -0500
        From: amsler at cs.utexas.edu
        Subject: Re: [Humanist] 24.270 getting involved
        In-Reply-To: <20100821210601.E900364B81 at woodward.joyent.us>

The major reason digital humanities scholars need to understand the  
programs that provide them with analyses is that those programs can be  
operating incorrectly and the programmers may not know to check for  
the types of errors that can occur when processing humanities data;  
data which may be outside the scope of the types of programs they  
normally use for text.

Two broad classes of errors can occur. First, character representation  
errors. Ordinary text processing programs have a limited repertoire of  
characters they expect to encounter. Just adding diacritics can throw  
some programs. There are several different schemes for representing  
characters with even simple diacritics and a corpus of texts could  
contain instances of all these different schemes in the same corpus.  
Basically, the first question a digital humanist must be able to  
answer is what are the 'characters' in all of the text to be processed  
and how are they represented. Are all special characters represented  
in the same encoding? How will they sort and how will they and the  
words they form be compared for equality and lexical order? If you  
don't know the answers to these questions, you can be surprised to  
discover that some of the words in your text were not treated  
properly, were left out, or went unrecognized.
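
As a minimal sketch of the character representation problem described above, the Python fragment below compares two encodings of the same visible word. The sample word and the helper name `normalize` are illustrative assumptions, not anything from the thread; the only library call is the standard `unicodedata.normalize`.

```python
import unicodedata

# Hypothetical sample data: the same visible word stored two ways.
precomposed = "caf\u00e9"   # 'é' as one code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' plus a combining acute accent (U+0301)


def normalize(word: str) -> str:
    """Return the NFC (composed) form so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", word)


raw_equal = precomposed == decomposed
normalized_equal = normalize(precomposed) == normalize(decomposed)

print(raw_equal)          # False: the code point sequences differ
print(normalized_equal)   # True: both reduce to the same NFC form
print(len(precomposed), len(decomposed))  # 4 5
```

A word count or concordance built on the raw strings would treat these as two different words, which is one way words can go "unrecognized" without any error being reported.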

Second, the algorithms that process text often use different methods  
of interpreting the objects they compare. Sort utilities, one of the  
most basic of programs to operate on text, can perform by 'folding  
case' or using a 'collating sequence' other than the one expected,  
especially for special characters. Sorting is a 'utility', not even a  
creative part of most applications--yet the 'sort order' of the words  
in a text can differ dependent upon what characteristics a sort  
utility has as defaults.
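
The effect of sort defaults can be sketched in a few lines of Python. The word list and the `accent_insensitive_key` helper are assumptions made for illustration; the helper is only a crude stand-in for a real collating sequence, not a locale-aware collation.

```python
import unicodedata

# Hypothetical word list mixing case and a diacritic.
words = ["zebra", "Apple", "église", "apple", "Zoo", "eclair"]


def accent_insensitive_key(word: str) -> str:
    """Fold case, decompose, and drop combining marks (a rough approximation)."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Default: raw code point order, so capitals come first and 'é' lands after 'z'.
codepoint_order = sorted(words)
# Case folded: capitals interleave, but 'é' still sorts after 'z'.
folded_order = sorted(words, key=str.casefold)
# Case folded and accent stripped: 'église' sorts with the other 'e' words.
stripped_order = sorted(words, key=accent_insensitive_key)

print(codepoint_order)  # ['Apple', 'Zoo', 'apple', 'eclair', 'zebra', 'église']
print(folded_order)     # ['Apple', 'apple', 'eclair', 'zebra', 'Zoo', 'église']
print(stripped_order)   # ['Apple', 'apple', 'eclair', 'église', 'zebra', 'Zoo']
```

The same six words come out in three different orders, and none of the three is wrong as far as the software is concerned; which one is right depends on what the scholar expects.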

There may not be any way to check for problems other than to create  
special data sets to use as test cases and to check that they process  
as expected.
When sorting, programmers know to check the top and bottom of the sort  
for unexpected instances in the data which seem to rise to the top or  
descend to the very bottom.
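
One way to picture this check, assuming a small hand-made test set, is to sort it and look only at the two ends. The function name `sort_extremes` and the sample words are hypothetical.

```python
def sort_extremes(words, n=2):
    """Sort by default code point order and return the first and last n items."""
    ordered = sorted(words)
    return ordered[:n], ordered[-n:]


# Hypothetical test data seeded with likely troublemakers:
# a leading space, a leading digit, and an accented capital.
test_words = ["banana", "apple", " cherry", "Émile", "42nd", "zebra", "date"]

top, bottom = sort_extremes(test_words)
print(top)     # [' cherry', '42nd']
print(bottom)  # ['zebra', 'Émile']
```

The entry with a leading space and the one with a leading digit rise to the top, while the accented name sinks below "zebra", which is the kind of unexpected instance the check is meant to reveal.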

In short, if a digital humanist scholar expects to rise to the level  
of an expert, they need to be able to test and diagnose what software  
is doing to their data. As a good doctor, they must be able to  
understand what can go wrong with prescribed courses of treatment of  
their 'patient', be able to find the errors made by technicians and  
others who carry out their instructions. If you don't know what to  
test for or how to check whether a treatment is being properly  
applied, you can't really claim to know what your results mean. You  
may not have to know how to make a thermometer, but you better know  
that there are several scales on which temperature can be recorded and  
that thermometers should all read the same temperature when in the  
same glass of water, and that if not, there is something wrong.

The universe will trick you if it can. Knowing what it can do and how  
is essential to using tools in any field; even if you don't know how  
to create the tools themselves.

        Date: Sun, 22 Aug 2010 02:47:09 +0000
        From: Peter Batke <batke_p at hotmail.com>
        Subject: RE: getting involved
        In-Reply-To: <4C705582.4080800 at mccarty.org.uk>

The quote is from S. M. Parrish, Cornell, pioneer in electronic concordance. The occasion was an after-dinner speech at an IBM conference in 1969. WorldCat under: PROCEEDINGS: COMPUTER APPLICATIONS TO PROBLEMS IN THE HUMANITIES.

"[...] I've tried to stay free of programming. I'm perfectly innocent of any knowledge of any programming language and I feel I must remain that way if I am going to continue to function as a scholar in literature; my heartfelt advice to any of you younger people embarking on this whole venture is to do the same--to ask programmers to do for you what they know how to do and what would be costly and painful for you to learn--to learn to talk their language but not to get involved in programming research or machine research. I know the best of you will not take this advice and will, therefore, break new barriers and so on, but I persist in offering it. [...]"

I really have no real sympathy for Parrish's point. I use it only as an example of attitudes about computers at a time when philologists still knew what philology was and loved the new muscle computers would give them. The notion that Parrish could not be a literary scholar and "program" explains why literary scholarship is no longer what it was. 
The authority the scholars of that time gathered through meticulous, slow and inefficient work has been diluted by electronic tools that allowed a broad range of less meticulous contemporaries (in the best sense of that word) to achieve roughly equivalent results, exact bibliographies, appropriate and complete quotations and neatly formatted papers. The lack of universal brilliance of the contemporary scholars as compared to what one sensed shining forth from the great ones of the past and near past working with slips of paper is easy to obscure in an accelerating shift toward new perspectives and problems not before suspected.
I actually expect an atrophy of some humanistic genres as much of what once was communicated through descriptive analysis which confused, challenged, enlightened and inspired the next generation will be found wanting because of its failure to avail itself of algorithmic tools. I do think that computers have become awfully close friends, helpmates and collaborators to our generation. But they have also spoiled that solitary perspective for us, the perspective that Parrish still had and that we computer symbionts have lost. 
I personally have not lost, I feel, but gained. I have cast my lot with the database and programming logic that extracts information. I actually find young colleagues (or old) who think they can avoid that path strangely stuck, bless them and their efforts. 
I have great hopes that the Google people and text miners in general, who not only program but think in terms of Markov chains (and about 20-30 other techniques) for extracting information from text (unimaginably vast quantities of text), can be brought to literature more easily than the current active cohorts of literature people can be brought to mathematical modeling. Some months ago there was a thread about economics and the maturing of disciplines toward mathematization that was thought not optional in the long term, understandable resistance by those who do not have a dog in the game notwithstanding. I think for us text people, it will come through a very large back door where thousands of CS people are working on terabytes of text. They will discover aesthetics and ideas and meaning in text expressions and they will have a larger canvas than the solitary critics of previous generations who inspired us.

These uncounted mathematicians focused on text may not bother with trying to untie the knots that centuries of commentaries on commentaries have produced. They may just list the relevant component parts in indexes and move on.
cheers, Peter 		 	   		  
