Mon Jul 25 07:52:37 CEST 2016

        Date: Sun, 24 Jul 2016 12:18:03 +0200
        From: Laura Dietz
        Subject: Re: 30.198 tools
Dear Humanists,

As a computer scientist, I also find it unfortunate if DH research
discussions revolve around tools rather than methodologies. Nearly all
tools have an underlying (CS) methodology - and this methodology might
be wrong.

In computer science we develop tools that, given certain inputs (e.g.
text), try to achieve the "best" outputs (e.g. topics). The tool is
developed based on a certain assumption of what "best" means. To take an
arbitrary example, topic model tools: the CS assumption is that "best"
topics are formed whenever words of the same group occur together in
many documents.

This assumption is clearly wrong - but some people found it useful in
practice. Sometimes it works, sometimes it doesn't. As a consequence,
results need to be double-checked by humanities and computer science
researchers in collaboration to make sure that "best" outputs are
actually achieved, that everyone is aware of error modes, and so that
better tools can be developed.
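
To make the co-occurrence assumption concrete, here is a minimal
sketch in Python. It is not how any particular topic model tool is
implemented (real tools fit a statistical model rather than counting
pairs directly); it only shows the raw signal the assumption relies
on. The function name cooccurrence_counts and the four-document toy
corpus are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """For each unordered word pair, count the documents containing both words."""
    counts = Counter()
    for text in documents:
        # Each word counts once per document; sorting gives a canonical pair order.
        vocabulary = sorted(set(text.lower().split()))
        for pair in combinations(vocabulary, 2):
            counts[pair] += 1
    return counts


# Toy corpus, invented for illustration only.
documents = [
    "river bank water",
    "river water fish",
    "bank money loan",
    "money loan interest",
]

counts = cooccurrence_counts(documents)

# Print the pairs that occur together most often.
for pair, n in counts.most_common(4):
    print(pair, n)
```

In this toy corpus "river"/"water" and "loan"/"money" each share two
documents, so the assumption groups them as one would hope. The word
"bank", however, shares one document with each group, and the counts
alone cannot say which sense is meant. That is the kind of error mode
that needs a human reader to catch.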

This argument holds for pretty much all tools I know of. This includes
web search, entity linking, named entity tagging, even the graph layout
algorithms provided by Gephi. All successful tools are based on
approximate assumptions (unless we have perfect methodologies for
human behavior). Of course, manual exploration without computers also
has many flaws.

Currently, computer scientists develop tools based on one methodology,
then humanists apply the tools to derive a different methodology. It is
problematic if these methodologies do not match. Humanists have better
insight than computer scientists into what "best" means. Rather than
believing that the tool will do the right thing, it would be very useful
to have this insight communicated back so that computer scientists can
develop better tools. For research to succeed, both sides need to be
aware of the underlying methodologies and the assumptions made.

In the future, I hope that sentences such as
    "we apply tool X on data Y"
are replaced with
    "from data Y we obtain results based on the [CS assumption here], as
    this is well aligned with [DH assumption here], because ..."

If you have any questions about the assumptions and methodologies
underlying your favorite tool, please send me an email off-list. I am
happy to help.

A nice day to all of you, too!


On 07/24/2016 08:47 AM, Humanist Discussion Group wrote:
Date: Sat, 23 Jul 2016 12:36:19 -0400
From: Molly Des Jardin
Subject: Re: 30.195 tools
In-Reply-To: <20160722063057.0FB457B6D at digitalhumanities.org>
> I agree with Andrea. I find it more useful to think about
> methodologies than tools. In my past projects I've had to "roll my
> own" quite frequently, and I'd rather hear about the method behind
> someone's implementation - regardless of whether it's self-made or an
> off-the-shelf tool - than just "I used Gephi to make this
> visualization." I'm not saying that in general that's where the
> conversation ends, but I have sat through more than one presentation
> that doesn't explain what's going on in the implementation or the why
> of using it, and that's frustrating.
> Molly

