[Humanist] 26.586 XML & scholarship

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Dec 15 11:52:08 CET 2012

                 Humanist Discussion Group, Vol. 26, No. 586.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Desmond Schmidt <desmond.allan.schmidt at gmail.com>          (1)
        Subject: Re:  26.581 XML & scholarship

  [2]   From:    Wendell Piez <wapiez at wendellpiez.com>                     (37)
        Subject: Re: [Humanist] 26.581 XML & scholarship

        Date: Fri, 14 Dec 2012 22:00:39 +1000
        From: Desmond Schmidt <desmond.allan.schmidt at gmail.com>
        Subject: Re:  26.581 XML & scholarship
        In-Reply-To: <20121214060225.AE38E2D96 at digitalhumanities.org>

> Date: Thu, 13 Dec 2012 22:26:00 -0500
> From: Doug Reside <dougreside at gmail.com>
> Subject: Re:  26.577 Folger Digital Texts --> XML & scholarship

> > But then I think about all of the attempts I and others have made to
> > create "easy to use" XML editors that end up being less functional and
> > harder to use than a simple text editor.  Anyone with a modicum of web
> > design experience who has tried to edit HTML in WordPress or Drupal
> > usually starts hunting for the "edit source" button immediately.  It
> > feels like there SHOULD be a better kind of data entry tool for
> > text-encoding than an angle bracket editor, but I'm not yet sure what
> > it is.

I'm glad that someone else recognises the difficulty of this problem.
It seems like it ought to be possible to build a graphical editor for
TEI-XML, but with 544 or more tags it's impossible to translate all the
structures that humanists want to record and represent them all
graphically. Simple textual highlighting works, sure, paragraph
structures work, but variants, virtual joins, footnotes, links, etc etc?
Since you have to represent many tags as raw XML what happens if the
user makes a mistake? You'd have to handle that error right there in
your online editor, not when the text is sent to the server. You'd have
to provide context-sensitive editing, hundreds of pages of explanations
as to what each tag signifies, and explain to the user how to fix each
mistake. Not a simple task to program, and certainly not a simple tool
to use.
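
To illustrate the first of those requirements only: a minimal sketch of
catching a markup mistake at edit time and pointing the user at its
location. It uses Python's standard xml.etree.ElementTree parser and checks
well-formedness only, not validity against the TEI schema; the function
name check_fragment is hypothetical and not taken from any existing editor.

```python
import xml.etree.ElementTree as ET


def check_fragment(fragment):
    """Return None if the fragment is well-formed XML, otherwise a
    (line, column, message) tuple describing the first parse error.

    Assumption: well-formedness only. Checking that the tags are legal
    TEI in this context would need a schema-aware validator.
    """
    try:
        ET.fromstring(fragment)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


good = "<p>A <hi>highlighted</hi> word.</p>"
bad = "<p>A <hi>highlighted word.</p>"  # <hi> is never closed

print(check_fragment(good))
print(check_fragment(bad))
```

Even this much tells the user only where the parser gave up, not what the
tag was supposed to mean or how to repair it, which is the harder part of
the problem described above.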

The user's need for a simple editor cannot be met by XML. You
have to think beyond it, and I believe a consensus is now emerging in
the digital humanities that at least the properties of text (NOT its
versions) can be practically represented as overlapping ranges. There
are quite a few projects now exploring this line of research: eComma,
CATMA, LMNL, our own standoff properties. It's not rocket science. It's
very simple, and it works. Check out our website austese.net/tests/.
Everything you see here is done without XML, from the server to the
visualisations, comparisons, everything. The only things that handle
XML are the import tools, of course. So I don't believe that XML is
actually needed any more to get our work done.
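
A minimal sketch of the overlapping-ranges idea, for readers who have not
met it. This is not the actual format used by eComma, CATMA, LMNL or the
standoff properties mentioned above; the names Range, ranges_at and crosses
are hypothetical, and the sample text and property names are invented for
illustration. The text is kept as a plain string and each property is a
named span of character offsets stored beside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    name: str   # property name, e.g. "emph" (illustrative only)
    start: int  # offset of first character, inclusive
    end: int    # offset just past the last character, exclusive


text = "Shall I compare thee to a summer's day?"


def span_of(sub):
    """Offsets of the first occurrence of sub in text."""
    start = text.index(sub)
    return start, start + len(sub)


# Properties live outside the text, so they are free to overlap.
ranges = [
    Range("line", 0, len(text)),
    Range("emph", *span_of("compare thee")),
    Range("quote", *span_of("thee to a summer's day")),
]


def ranges_at(offset):
    """Names of all properties covering the character at offset."""
    return sorted(r.name for r in ranges if r.start <= offset < r.end)


def crosses(a, b):
    """True if a and b overlap without either containing the other,
    which is the case a single XML element hierarchy cannot express."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


emph, quote = ranges[1], ranges[2]
print(ranges_at(text.index("thee")))
print(crosses(emph, quote))
```

Here "emph" and "quote" share the word "thee" but neither contains the
other, so in XML one of them would have to be split or expressed with
milestones; as ranges they are simply two entries in a list.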

Desmond Schmidt
eResearch Lab
University of Queensland

        Date: Fri, 14 Dec 2012 09:53:13 -0500
        From: Wendell Piez <wapiez at wendellpiez.com>
        Subject: Re: [Humanist] 26.581 XML & scholarship
        In-Reply-To: <20121214060225.AE38E2D96 at digitalhumanities.org>

Dear Desmond,

On Fri, Dec 14, 2012 at 1:02 AM, Humanist Discussion Group
<willard.mccarty at mccarty.org.uk> wrote:
> I am likewise unconvinced by Wendell's argument
> that tinkering with XML or something like it is a user requirement for
> digital humanists.

Actually I wouldn't put it like this: it puts the cart of tinkering in
front of the horse of scholarly requirements. Rather, I'd say that I
don't think all of the requirements of scholars working with digital
media can be addressed without tinkering (either by themselves or by
others addressing their needs), for the foreseeable future and
possibly intrinsically. (I haven't decided if digital media per se
demand this, due to the generality of algorithmic processes, or
whether they're more like, say, print media, and that means and
methods will stabilize after a few generations of development. Maybe
both potentials are there. This is of course Willard's question in
that other thread.) This is precisely because I don't think that any
technology can be aligned preemptively with all the needs of scholars.

So it's not that scholars must be tinkerers, but that scholarship as a
whole will not advance (on a digital platform) without some scholars
sometimes tinkering (or at any rate, being party to tinkering). A
black-box format that discourages tinkering will inevitably hinder or
prohibit some kinds of work even while enabling others. And yes, even
an openly specified and standard technology such as XML can become a
black box (at least for practical purposes) as it becomes more complex
and difficult to learn, and its applications more burdened by implicit
knowledge and "lore".

I know that at the far side of my argument is the idea that the
workman who has not invented his tool does not know how to use it. Yet
there is a way in which I believe that.


Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
