Last CfP and Deadline Extension: Second Workshop on Language technology for Digital Historical Archives (LTDHA)
Apologies for multiple postings

****

Second Workshop on Language technology for Digital Historical Archives
With a special focus on Central and (South-)Eastern Europe, Middle East and North Africa
https://www.inf.uni-hamburg.de/inst/dmp/hercore/publications/ltdha.html

in conjunction with the 12th biennial Recent Advances in Natural Language Processing conference (RANLP 2019), Varna, Bulgaria
http://lml.bas.bg/ranlp2019/start.php

WORKSHOP DATE: September 5, 2019

Last Call for Papers
SUBMISSION DEADLINE EXTENSION: 21.07.2019

Motivation

During the last decades Digital Humanities has evolved dramatically, from simple database applications to complex systems involving the most recent state of the art in Computer Science. Language Technology in particular plays a major role, either for processing the metadata of recorded objects or for analyzing and interpreting content.

Applying Language Technology methods to objects from the humanities in general, and historical archives in particular, is a challenge for NLP-related research: data is heterogeneous (image/text), often incomplete (e.g. OCR errors), multilingual within one document (historic documents with Latin and/or Classical Greek paragraphs) and difficult to structure (paragraphs, titles and pages are somewhat different in historical texts). Corpus-based methods, nowadays standard in NLP research, often cannot be applied because the necessary large training data is missing.

Moreover, requirements for tools in Digital Humanities, especially tools dedicated to cultural heritage objects, differ from those for tools applied to modern texts. Thus, research in Digital Humanities also involves: adapting existing NLP tools to historical variants of languages; developing tools for new languages; making tools robust to syntactic deviation; and adapting semantic resources.

Central and Eastern Europe as well as the Middle East and North Africa have always been characterized by a high concentration of languages and cultures interacting with each other. In a relatively small area, texts written in at least 10 alphabets (Arabic, Hebrew, Armenian, Georgian, Greek, Cyrillic, Ge'ez, Syriac, Latin and Coptic) can be found. At the same time, information within these texts is important beyond the borders of a given language or script (e.g. documents in Ge'ez are often translations of lost Coptic or ancient Greek texts). Places, persons and events have language-dependent denominations but refer to the same individual or geographical location.

Unfortunately, especially in this area, many historical documents are in bad condition; many languages or dialects became extinct over time and their written evidence is rare. Digital methods seem the perfect means for the preservation and investigation of this rich cultural heritage. However, up to now, concentrated activities seem to be absent, probably also due to the lack of adequate NLP resources and tools. Thus, it is necessary to evaluate existing technology, monitor current activities and network research teams in this area, all of which are aims of this workshop.

This is the second edition of the Language technology for Digital Humanities in Central and (South-)Eastern Europe workshop, first held in 2017 at RANLP. In 2019, the International Year of Indigenous Languages, this edition also expands to the Middle East and North Africa.

Topics

- Corpora of diachronic variants and language dialects
- NLP tools for processing historical documents
- Intelligent search in digital archives
- (Semi-)automatic (meta)annotation of historical texts
- Treating uncertain and vague information from historical documents
- Ontologies for historical texts
- Evaluation of current frameworks (CLARIN, DARIAH) on DH objects related to historical texts
- Machine learning approaches for under-resourced DH objects
- Methods for dealing with incompletely specified objects (e.g. partially known features or values)
- Automatic extraction of metadata
- Metadata interoperability for digital objects
- Intelligent search in digital historical archives
- Geo- and time references in historical documents

The focus is on languages from the above-mentioned area.

Submissions
===========

Please submit your paper through the START system at:
https://www.softconf.com/ranlp2019/LTDHA/

The reviewing process is anonymous. Double submission is allowed, but authors will be asked to declare it at the time of submission.

Long papers should be 8 pages long plus 2 extra pages for references. Short papers should be 4 pages long plus 2 extra pages for references. Accepted short papers will be presented either as short oral presentations or as posters.

All submissions should be formatted using the ACL-based stylesheets provided for RANLP (http://lml.bas.bg/ranlp2019/submissions.php#styles).

Accepted papers will be published in the workshop proceedings and uploaded to the ACL Anthology.

Important Dates:
================

Paper submission deadline (EXTENDED): July 21, 2019
Notification of acceptance: August 8, 2019
Camera-ready papers due: August 20, 2019
LTDHA Workshop: September 5, 2019

Organizing Committee

Cristina Vertan, University of Hamburg, Germany
Petya Osenova, Bulgarian Academy of Sciences, Bulgaria
Dimitar Iliev, St. Kliment Ohridski University of Sofia

Programme Committee (TBA)

Martha Yifiru Abate, University of Addis Ababa
Gabriel Bodard, Institute of Classical Studies, SAS, London
Elie Damaoui, University of Balamand
Antske Fokkens, Vrije Universiteit, Amsterdam
Walther v. Hahn, University of Hamburg
Vladislav Kubon, Charles University, Prague
Preslav Nakov, Qatar University
Maciej Ogrodniczuk, Polish Academy of Science
Gabor Proszeky, Catholic University, Budapest
Kiril Simov, Bulgarian Academy of Sciences
Stefan Trausan, Politechnics University, Bucharest
Valeria Vitale, Institute of Classical Studies, SAS, London