[Humanist] 26.116 data-mining

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Jun 30 01:13:13 CEST 2012

                 Humanist Discussion Group, Vol. 26, No. 116.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

        Date: Thu, 28 Jun 2012 16:29:40 +0100
        From: Richard Lewis <richard.lewis at gold.ac.uk>
        Subject: Re: [Humanist] 26.86 data mining?
        In-Reply-To: <20120613204802.92845281BD3 at woodward.joyent.us>

At Wed, 13 Jun 2012 15:06:22 +1000,
Willard McCarty wrote:

> In a forthcoming article in Interdisciplinary Science Reviews, the 
> author says that,
> > By data mining, I mean the activity of fitting a wide variety of
> > models to the data in the opportunistic hope of finding one that
> > fits well.
> But if this forthcoming article has nailed an important meaning of
> the term, then it would be good to have some description of how such
> mining is done.

This definition borrows terminology common in the machine
learning/statistical learning literature. A model is a way of mapping
the inputs (independent variables) to the outputs (dependent
variables) and can be expressed in software in a variety of ways
(including Bayesian classifiers, decision trees, support vector
machines, and neural networks). And fitting is the process by which a
model is tuned to give the correct output for a given input. The
description of the process as "opportunistic" is a little pessimistic
(or perhaps tongue-in-cheek). In fact there are well-established
methods (learning algorithms) for fitting models to data in both
supervised and unsupervised learning (such as least squares,
clustering, genetic algorithms) and for evaluating the resulting model
(such as cross-validation).
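
To make that concrete, here is a minimal sketch in Python of fitting two
candidate models by least squares and comparing them by cross-validation.
Everything in it is an assumption made for illustration: the synthetic
data, the two candidate models (a constant and a straight line), and the
helper names fit_constant, fit_line and cross_val_mse are hypothetical and
do not come from the article under discussion.

```python
import random
from statistics import mean


def fit_constant(xs, ys):
    """Fit y = c by least squares; the best c is the mean of ys."""
    c = mean(ys)
    return lambda x: c


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares (closed form)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def cross_val_mse(fit, xs, ys, k=5):
    """Average mean squared error over k held-out folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th point, starting at `fold`, is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in sorted(test_idx)]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


# Synthetic data (assumed for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

# "Fitting a wide variety of models": here only two candidates.
candidates = {"constant": fit_constant, "line": fit_line}
scores = {name: cross_val_mse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)

print(scores)
print("best model:", best)
```

The point of the held-out folds is that each model is scored on data it
was not fitted to, which is what keeps the search among candidate models
from being purely opportunistic.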

Richard Lewis
ISMS, Computing
Goldsmiths, University of London
t: +44 (0)20 7078 5134
j: ironchicken at jabber.earth.li
@: lewisrichard
s: richardjlewis
