[Humanist] 24.810 events: XML; annotation

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Tue Mar 22 07:27:01 CET 2011

                 Humanist Discussion Group, Vol. 24, No. 810.
         Centre for Computing in the Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

  [1]   From:    "Gibson, Matthew (msg2d)" <msg2d at eservices.virginia.edu>  (24)
        Subject: XML Workshop

  [2]   From:    Passarotti Marco Carlo <marco.passarotti at unicatt.it>      (74)
        Subject: Call for Papers: Workshop on Annotation of Corpora for
                Research in the Humanities

        Date: Mon, 21 Mar 2011 11:02:33 -0400
        From: "Gibson, Matthew (msg2d)" <msg2d at eservices.virginia.edu>
        Subject: XML Workshop

XML Development: From Markup to Application
April 25-28, 2011, Washington, DC
Washington DC—The Association of Research Libraries (ARL) is pleased to offer an in-depth workshop focused on Web development with XML.

Taught by experienced XML instructors and developers Matthew Gibson, Director of Digital Programs at the Virginia Foundation for the Humanities at the University of Virginia, and Patrick Yott, Director of Library Technology Services at Northeastern University, this four-day workshop will explore XML with a specific focus on fundamentals of design, markup, and use. Participants will use XML and related technologies in the creation of a prototype digital publication.

Topics to be covered include:

 *   XML: What is it? And why should we care about it?
 *   Working with content models (primarily XML Schema and some Schematron) and methods of using them when constructing and validating XML
 *   Implementing methods of content transformation and delivery (using XSLT and XPath) so the XML we build can be delivered, read, and used in a variety of formats
 *   Utilizing Solr, a Lucene-based search server, and XSLT to deliver the final class project

Participants should have a basic familiarity and some experience with markup (e.g., HTML, some XML, etc.).

Event Details
Dates: Monday, April 25 – Thursday, April 28, 2011
Time: 9:00 a.m.–5:00 p.m.
Location: George Washington University Marvin Center, Washington, DC
Fee: $1,500
Register: by March 25, 2011, at http://www.arl.org/stats/statsevents/index.shtml.


Matthew Gibson
Director of Digital Programs
Virginia Foundation for the Humanities
p. 434.924.4531

        Date: Mon, 21 Mar 2011 12:01:08 +0100
        From: Passarotti Marco Carlo <marco.passarotti at unicatt.it>
        Subject: Call for Papers: Workshop on Annotation of Corpora for Research in the Humanities

---- Workshop on Annotation of Corpora for Research in the Humanities ----

The workshop on "Annotation of corpora for research in the Humanities" will be held on January 5, 2012 at the University of Heidelberg (Germany) (http://www.coli.uni-saarland.de/conf/ACRH10/).

Submissions are invited for papers, posters and demonstrations presenting high-quality, previously unpublished research on the topics described below. Contributions should focus on results from completed as well as ongoing research, with an emphasis on novel approaches, methods, ideas, and perspectives, whether descriptive, theoretical, formal or computational. Proceedings will be published as a special issue of the 'Journal for Language Technology and Computational Linguistics' (JLCL: http://www.jlcl.org/). Publication will be online only.

The workshop will be co-located with the Tenth International Workshop on Treebanks and Linguistic Theories (TLT10), which will be held on January 6-7, 2012 (http://tlt10.cl.uni-heidelberg.de).


The workshop aims at building a tighter collaboration between people working in various areas of the Humanities (such as literature, philology, history etc.) and the research community involved in developing, using and making accessible annotated corpora.

Addressing topics related to annotated corpora for research in the Humanities is an interdisciplinary task, which involves corpus and computational linguists (mostly those working in literary computing), philologists, scholars in the Humanities and computer scientists. However, this interdisciplinarity is not fully realised yet. Indeed, philologists and scholars are not used to exploiting NLP tools and language resources such as annotated corpora; in turn, computational linguists tend to develop language resources for NLP purposes only.

For instance, although many corpora relevant to research in the Humanities are now available in digital format (theatrical plays, contemporary novels, critical literature, literary reviews etc.), only a few of them are linguistically tagged. Historical corpora are also a case of special interest, since their creation demands a strong interplay between computational linguistics and more traditional scholarship. Over the past few years a number of annotated historical corpora have been started, among which are treebanks for Middle, Early Modern and Old English, Early New High German, Medieval Portuguese, Ugaritic, Latin, Ancient Greek and several translations of the New Testament into Indo-European languages. The experience of this ever-growing group of projects can provide many suggestions on the methodology as well as on the practice of interaction between literary studies, philology and corpus linguistics.

Moreover, we believe that a tighter collaboration between people working in the Humanities and the research community involved in developing annotated corpora is needed: annotating a corpus from scratch remains a labor-intensive and time-consuming task, but today it can be simplified by drawing intensively on prior experience in the field.


To overcome the above-mentioned issues, the workshop aims to cover a wide range of topics related to the annotation of corpora for research in the Humanities.
The topics to be addressed in the workshop include (but are not limited to) the following:

- specific issues related to the annotation of corpora for research in the Humanities
- annotated corpora as a basis for research in the Humanities
- diachronic, historical and literary annotated corpora
- use of annotated corpora for stylometrics and authorship attribution
- philological issues, like different readings, textual variants, apparatus, non-standard orthography and spelling variation
- annotation principles and schemes of corpora for research in the Humanities
- adaptation of NLP tools for older language varieties
- specific features of tools for accessing and retrieving annotated corpora to address various research topics in the Humanities


Gregory Crane (Tufts University, Boston, USA).


Deadlines: always midnight, UTC ('Coordinated Universal Time'), ignoring DST ('Daylight Saving Time'):
- Deadline for paper submission: September 22, 2011
- Notification of acceptance: October 28, 2011
- Final version of paper for workshop proceedings: November 18, 2011
- Workshop: January 5, 2012


We invite the submission of full papers describing original, unpublished research related to the topics of the workshop. Papers should not exceed 12 pages.
The language of the workshop is English, and all papers should also be submitted in well-checked English.

Papers should be submitted in PDF format only. Submissions have to be made via the EasyChair page of the workshop at: https://www.easychair.org/conferences/?conf=acrh2012. Please first register at EasyChair if you do not have an EasyChair account.

The style guidelines follow the specifications required by JLCL. They can be found here: http://www.jlcl.org/index.php?modus=style_sheets&language=en.
Please note that, as reviewing will be double-blind, the papers should not include the authors' names and affiliations or any references to websites, project names etc. that reveal the authors' identity. Furthermore, any self-reference should be avoided. For instance, instead of "We previously showed (Brown, 2001)...", use citations such as "Brown previously showed (Brown, 2001)...". Each submitted paper will be reviewed by three members of the program committee.

Submitted papers can be for oral or poster presentations. The two kinds of presentation are treated identically, both in the reviewing process and in publication in the proceedings.


The oral presentations at the workshop will be 30 minutes long (25 minutes for presentation and 5 minutes for questions and discussion).


- Francesco Mambrini (Tufts University, Boston; Univ. Cattolica del Sacro Cuore, Milan, Italy)
- Marco Passarotti (Università Cattolica del Sacro Cuore, Milan, Italy)
- Caroline Sporleder (Saarland University, Saarbrücken, Germany)


- Lars Borin, Sweden
- Milena Dobreva, Scotland
- Anette Frank, Germany
- Jost Gippert, Germany
- Erhard Hinrichs, Germany
- Anke Luedeling, Germany
- Willard McCarty, UK
- Alexander Mehler, Germany
- Adam Przepiórkowski, Poland
- Paul Rayson, UK
- Roman Schneider, Germany
- Raymond Siemens, Canada
- Manfred Stede, Germany
- Angelika Storrer, Germany
- Martin Volk, Switzerland


- Anette Frank
- Markus Kirschner
- Christoph Mayer
- Madeline Remse
- Corinna Schwarz
- Anke Sopka
All Heidelberg University
