[Humanist] 26.246 big data

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Aug 25 10:24:33 CEST 2012

                 Humanist Discussion Group, Vol. 26, No. 246.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    Wendell Piez <wapiez at mulberrytech.com>                   (101)
        Subject: Re: [Humanist] 26.241 buying a bill of goods

  [2]   From:    amsler at cs.utexas.edu                                      (40)
        Subject: Re: [Humanist] 26.241 buying a bill of goods

        Date: Fri, 24 Aug 2012 16:17:36 -0400
        From: Wendell Piez <wapiez at mulberrytech.com>
        Subject: Re: [Humanist] 26.241 buying a bill of goods
        In-Reply-To: <20120823062436.B50BB2855E2 at woodward.joyent.us>

Dear Willard,

On 8/23/2012 2:24 AM, Humanist Discussion Group wrote:
> Allow me to pose a question about where rhetoric leaves off and
> matters (we think) of substance begin. I quote from an e-mail message
> that just popped in:
>> The phenomena, known as big data is enabling organisations to get the
>> right answers to their biggest questions-faster.
>> But big data tools are not, in themselves a ‘magic bullet’ and there
>> will be challenges along the way that will threaten to derail any
>> initiative before it even gets off the ground.  For a start,
>> justifying the investment in big data isn't easy. Many will struggle
>> to define a business case, and the structural, people and process
>> changes that will likely occur with a big data effort need to be
>> managed very carefully to ensure success.
>> Learn how, by hearing from the early adopters and key innovators
>> leading the way in big data and advanced analytics at the 3rd Big
>> Data Insight Group Forum on 13 September 2012.
> The above can be made considerably more genteel, considerably
> more like what one of us might write. It's trivial to find the markers of
> ignorance in the above text, the mistakes in punctuation, grammatical
> number and so on, and trivial to find the buzz-words and to adjust
> for the humanities. These are errors and infelicities that can easily
> be corrected, e.g. thus:
>> The phenomenon known as "big data" enables us to make significant
>> progress with our most difficult questions, and to do this faster than
>> before.
>> But big data collections and tools are not sufficient in themselves.
>> Before they can be provided and put to use, challenges are likely to
>> threaten any project before it even begins. For a start,
>> justifying the investment in big data isn't easy. The changes in
>> institutional structures and processes and the new abilities required
>> from those involved need to be managed very carefully to ensure
>> a good outcome.
>> Learn how such an outcome may be secured by hearing from early
>> adopters and key innovators leading the way in big data and
>> advanced analytics at the 3rd Big Data Insight Group Forum on
>> 13 September 2012.
> One could go further. But even so, the bill of goods being sold here
> remains undisturbed and, I think if we're honest with ourselves,
> could readily be encountered in an academic context.
> What's assumed rather than asked? What "critical thinking", as we
> call it, is missing?

I love your observation and of course I much prefer your rewritten 
version. Like you, I had to ask myself what sort of contraption 
("initiative") was going to get off the ground before it was derailed. 
(Maybe one of those rocket ships that takes off from a launch track.)

But I hesitate to agree with your implication that anything much like 
critical thinking is even warranted here. This isn't being written for 
us; it isn't even being written for someone who needs to be sold on "Big 
Data". Instead, it's written for someone who has already decided this 
topic is interesting and maybe important (for their career if nothing 
else), that it holds promise, and who only needs something they can take 
to their manager to make the case that a junket (which after all will 
also be a learning opportunity) will be worth dipping into the budget 
for. To them, the sale has already been made, and the bill of goods is 
only so they have a piece of paper that can be stamped for approval.

Indeed, look carefully here and you will see the core of the promise is 
the acknowledgement that "Many will struggle to define a business case", 
which is subtly and importantly different from your rephrasing 
("justifying the investment in big data isn't easy"). The audience 
doesn't want to justify an investment: they want to make a business 
case, and the implication is that those who come will be able to fashion 
one, while those who don't will only struggle. Why is it important to 
have a business case, as opposed to being able to justify an investment? 
Because a business case is enough to meet the need for a business case 
(in an environment where just having one is enough), while being able to 
justify an investment only opens a discussion.

I dare say that in this context, the correct amount of critical thinking 
will signal engagement with the topic while taking care to leave the 
basic premise, as you say, undisturbed. So questioning anyone's ideas is 
not part of the exercise. Thus, it is fine to suggest that a big 
investment in Big Data brings risks with it, since we have already 
accepted that: after all, I am an important person, who takes risks, and 
risks can pay off or I wouldn't want to see myself as a risk-taker. 
(What wouldn't be fine would be to say "we are going to ask whether this 
entire enterprise isn't a big waste of time and money".)

(FWIW, I don't believe it inevitably is in this case, although I 
continue to tell my friends I am only interested in Tiny Data. "Tiny 
Data!" says one of my Big Data friends. "You are only interested if the 
second one is different!" Exactly.)


Wendell Piez                            mailto:wapiez at mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
   Mulberry Technologies: A Consultancy Specializing in SGML and XML

        Date: Fri, 24 Aug 2012 18:54:03 -0500
        From: amsler at cs.utexas.edu
        Subject: Re: [Humanist] 26.241 buying a bill of goods
        In-Reply-To: <20120823062436.B50BB2855E2 at woodward.joyent.us>

The assumption of course is that Big Data contains undiscovered  
beneficial knowledge that is impossible to obtain by any previously  
available means. It attempts to collect funding for Big Data projects,  
funding which presumably has to be newly appropriated for this task,  
on the belief that it will yield answers that not only will justify  
the funds, but could lead to a reduction in the cost of operations.

In this regard Big Data is very much like the older buzzword of
'computerizing' something: the claim that putting operations into the
computer will bring benefits in both efficiency and cost savings that
justify the new expense. (In actuality, the benefit of
'computerizing' operations is to allow changes in operations that
would never have been approved except for the fact that the computer
has different requirements for data entry than people do, which lets
the system change despite entrenched social beliefs as to how things
should be done.)

The use of 'Big Data' will thus be seized upon as a means of obtaining  
funding which isn't available otherwise. Because it is a technical  
methodology, it bypasses criticisms that it involves paying for human  
judgement. It's the computer that will make the discoveries, not human  
beings; therefore the discoveries will be objective and not subjective.

Big Data is at this point being proposed blindly as offering solutions  
when in most cases it is only offering a research project. To offer  
solutions it would have to be the case that similar Big Data projects  
in the same field can be shown to have provided benefits.

This will invariably lead to the next stage of Big Data failures, when  
the funders realize that there aren't financial benefits to some Big  
Data applications. Big Data tasks will then be separated into those  
that have worked and those that haven't worked. It will then be realized
that what was being funded was 'research' into the utility of Big Data
for a given task.

And the cycle will advance to the next buzzword.

Now, I should also note that there WILL be Big Data successes. One  
cannot discount every application of a new methodology--it is just  
that funders should be aware that they are in many cases being asked  
to fund research not development. I like research. It is fun and leads  
to new discoveries. Just don't confuse 'research' with operational  
improvements. You can't guarantee research will save money. Buggy whip  
manufacturers couldn't have saved their industry by using better data  
analysis of their customers' purchases... if anything they would have  
realized they should cease manufacturing sooner.
